2015-07-03 09:01:31

by Chen, Tiejun

Subject: [PATCH 1/2] kvm: make preempt_notifier free out CONFIG_PREEMPT_NOTIFIERS

CONFIG_KVM always implies that CONFIG_PREEMPT_NOTIFIERS is enabled, so the
#ifdef guard around preempt_notifier in struct kvm_vcpu is pointless.

CC: Gleb Natapov <[email protected]>
CC: Paolo Bonzini <[email protected]>
Signed-off-by: Tiejun Chen <[email protected]>
---
include/linux/kvm_host.h | 2 --
1 file changed, 2 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9564fd7..695f4f3 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -220,9 +220,7 @@ struct kvm_mmio_fragment {

 struct kvm_vcpu {
 	struct kvm *kvm;
-#ifdef CONFIG_PREEMPT_NOTIFIERS
 	struct preempt_notifier preempt_notifier;
-#endif
 	int cpu;
 	int vcpu_id;
 	int srcu_idx;
--
1.9.1
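
The reasoning in patch 1/2 can be illustrated with a small userspace sketch. It is not kernel code: the struct names, the placeholder notifier type, and the helper layouts_match() are hypothetical, and the #define only stands in for the patch description's claim that the option is always enabled together with KVM. Under that assumption the guarded and unguarded layouts are identical, which is why the guard can go.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

/* Assumption: stands in for the option always being enabled with KVM. */
#define CONFIG_PREEMPT_NOTIFIERS 1

/* Hypothetical placeholder for the kernel's struct preempt_notifier. */
struct notifier_sketch {
	void *link;
	void *ops;
};

/* Layout before the patch: the member is wrapped in the #ifdef. */
struct vcpu_guarded {
	void *kvm;
#ifdef CONFIG_PREEMPT_NOTIFIERS
	struct notifier_sketch preempt_notifier;
#endif
	int cpu;
};

/* Layout after the patch: the member is unconditional. */
struct vcpu_unguarded {
	void *kvm;
	struct notifier_sketch preempt_notifier;
	int cpu;
};

/* With the option always defined, the guard cannot change the size. */
static_assert(sizeof(struct vcpu_guarded) == sizeof(struct vcpu_unguarded),
	      "guard has no effect while the option is always defined");

/* Returns 1 when the members after the guard sit at the same offset. */
int layouts_match(void)
{
	return offsetof(struct vcpu_guarded, cpu) ==
	       offsetof(struct vcpu_unguarded, cpu);
}
```

Calling layouts_match() from a test harness should return 1 for as long as the #define is present.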


2015-07-03 09:01:48

by Chen, Tiejun

Subject: [PATCH 2/2] kvm: enable preemption to register/unregister preempt notifier

After commit 1cde2930e154 ("sched/preempt: Add static_key() to
preempt_notifiers"), preempt_notifier_{register, unregister} always take a
mutex, jump_label_mutex. They therefore must no longer be called with
preemption disabled. Calling them with preemption enabled is also safe,
since we are only handling per-vcpu state while holding vcpu->mutex.
Otherwise, warnings like the following are triggered:

BUG: scheduling while atomic: qemu-system-x86/17177/0x00000002
2 locks held by qemu-system-x86/17177:
#0: (&vcpu->mutex){+.+.+.}, at: [<ffffffffa035fb48>] vcpu_load+0x28/0xf0 [kvm]
#1: (jump_label_mutex){+.+.+.}, at: [<ffffffff81244b54>] static_key_slow_inc+0xc4/0x140
Modules linked in: x86_pkg_temp_thermal kvm_intel kvm
Preemption disabled at:[<ffffffffa035fd3e>] kvm_vcpu_ioctl+0x7e/0xeb0 [kvm]

CPU: 2 PID: 17177 Comm: qemu-system-x86 Tainted: G W 4.1.0+ #30
Hardware name: Dell Inc. OptiPlex 9020/0DNKMN, BIOS A05 12/05/2013
0000000000200206 ffff8801c584bc38 ffffffff81f974ab 0000000000000003
ffff880211289a80 ffff8801c584bc58 ffffffff81f8fd3e 0000000000000001
ffff8802161d5d00 ffff8801c584bcb8 ffffffff81fa43dc ffff8801c584bd68
Call Trace:
[<ffffffff81f974ab>] dump_stack+0x95/0xf2
[<ffffffff81f8fd3e>] __schedule_bug+0x108/0x126
[<ffffffff81fa43dc>] __schedule+0x12dc/0x1590
[<ffffffff81fa4965>] schedule+0x75/0x150
[<ffffffff81fa7683>] ? mutex_lock_nested+0x393/0x780
[<ffffffff81fa4db0>] schedule_preempt_disabled+0x30/0x60
[<ffffffff81fa753a>] mutex_lock_nested+0x24a/0x780
[<ffffffff81244b54>] ? static_key_slow_inc+0xc4/0x140
[<ffffffff81244b54>] ? static_key_slow_inc+0xc4/0x140
[<ffffffffa035fb48>] ? vcpu_load+0x28/0xf0 [kvm]
[<ffffffff81244b54>] static_key_slow_inc+0xc4/0x140
[<ffffffff810d36a5>] preempt_notifier_register+0x25/0x70
[<ffffffffa035fb96>] vcpu_load+0x76/0xf0 [kvm]
[<ffffffffa035fd3e>] kvm_vcpu_ioctl+0x7e/0xeb0 [kvm]
[<ffffffff8110bc40>] ? __lock_is_held+0x70/0xa0
[<ffffffff810dde79>] ? get_parent_ip+0x19/0x90
[<ffffffff8130e4b4>] do_vfs_ioctl+0x3c4/0x910
[<ffffffff81320a01>] ? expand_files+0x311/0x360
[<ffffffff815370af>] ? selinux_file_ioctl+0x6f/0x150
[<ffffffff8130eaad>] SyS_ioctl+0xad/0xe0
[<ffffffff81faf857>] entry_SYSCALL_64_fastpath+0x12/0x6f

CC: Gleb Natapov <[email protected]>
CC: Paolo Bonzini <[email protected]>
Signed-off-by: Tiejun Chen <[email protected]>
---
virt/kvm/kvm_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 848af90..bde5f66f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -127,8 +127,8 @@ int vcpu_load(struct kvm_vcpu *vcpu)

 	if (mutex_lock_killable(&vcpu->mutex))
 		return -EINTR;
-	cpu = get_cpu();
 	preempt_notifier_register(&vcpu->preempt_notifier);
+	cpu = get_cpu();
 	kvm_arch_vcpu_load(vcpu, cpu);
 	put_cpu();
 	return 0;
@@ -138,8 +138,8 @@ void vcpu_put(struct kvm_vcpu *vcpu)
 {
 	preempt_disable();
 	kvm_arch_vcpu_put(vcpu);
-	preempt_notifier_unregister(&vcpu->preempt_notifier);
 	preempt_enable();
+	preempt_notifier_unregister(&vcpu->preempt_notifier);
 	mutex_unlock(&vcpu->mutex);
 }

--
1.9.1
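
The ordering problem described in patch 2/2 can be modelled in userspace as a rough sketch. Nothing here is kernel API: every sim_* name is hypothetical, the counter is only a stand-in for the kernel's preempt count, and sim_sleeping_lock() merely records the condition that "scheduling while atomic" reports. The sketch assumes, as the patch description says, that registering the notifier takes a mutex and may therefore sleep.

```c
#include <assert.h>

/* Stand-in for the per-task preempt count (hypothetical, userspace only). */
static int sim_preempt_count;
/* Number of sleeping calls made while preemption was disabled. */
static int sim_atomic_sleeps;

static void sim_preempt_disable(void)
{
	sim_preempt_count++;
}

static void sim_preempt_enable(void)
{
	assert(sim_preempt_count > 0);
	sim_preempt_count--;
}

/* Stand-in for taking a mutex: it may sleep, so doing it with
 * preemption disabled is recorded as a bug. */
static void sim_sleeping_lock(void)
{
	if (sim_preempt_count > 0)
		sim_atomic_sleeps++;
}

/* Stand-in for preempt_notifier_register() once it takes a mutex. */
static void sim_notifier_register(void)
{
	sim_sleeping_lock();
}

/* Ordering before the patch: disable preemption, then register. */
int sim_vcpu_load_original(void)
{
	sim_atomic_sleeps = 0;
	sim_preempt_disable();		/* models get_cpu() */
	sim_notifier_register();
	sim_preempt_enable();		/* models put_cpu() */
	return sim_atomic_sleeps;
}

/* Ordering after the patch: register first, then disable preemption. */
int sim_vcpu_load_patched(void)
{
	sim_atomic_sleeps = 0;
	sim_notifier_register();
	sim_preempt_disable();		/* models get_cpu() */
	sim_preempt_enable();		/* models put_cpu() */
	return sim_atomic_sleeps;
}
```

In this model the original ordering records one sleep in atomic context and the reordered version records none. The sketch only covers the ordering; it says nothing about whether the underlying commit should be kept.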

2015-07-03 11:24:40

by Paolo Bonzini

Subject: Re: [PATCH 2/2] kvm: enable preemption to register/unregister preempt notifier

On 03/07/2015 10:56, Tiejun Chen wrote:
> After commit 1cde2930e154 ("sched/preempt: Add static_key() to
> preempt_notifiers") is introduced, preempt_notifier_{register, unregister}
> always hold a mutex, jump_label_mutex. So in current case this shouldn't
> work further under the circumstance of disabled preemption, and its also
> safe since we're just handling a per-vcpu stuff with holding vcpu->mutex.
> Otherwise, some warning messages are posted like this,
>
> BUG: scheduling while atomic: qemu-system-x86/17177/0x00000002
> 2 locks held by qemu-system-x86/17177:
> #0: (&vcpu->mutex){+.+.+.}, at: [<ffffffffa035fb48>] vcpu_load+0x28/0xf0 [kvm]
> #1: (jump_label_mutex){+.+.+.}, at: [<ffffffff81244b54>] static_key_slow_inc+0xc4/0x140
> Modules linked in: x86_pkg_temp_thermal kvm_intel kvm
> Preemption disabled at:[<ffffffffa035fd3e>] kvm_vcpu_ioctl+0x7e/0xeb0 [kvm]

Thanks for your work Tiejun. However, the original patch is crap. I've
asked to revert it.

Paolo

> CPU: 2 PID: 17177 Comm: qemu-system-x86 Tainted: G W 4.1.0+ #30
> Hardware name: Dell Inc. OptiPlex 9020/0DNKMN, BIOS A05 12/05/2013
> 0000000000200206 ffff8801c584bc38 ffffffff81f974ab 0000000000000003
> ffff880211289a80 ffff8801c584bc58 ffffffff81f8fd3e 0000000000000001
> ffff8802161d5d00 ffff8801c584bcb8 ffffffff81fa43dc ffff8801c584bd68
> Call Trace:
> [<ffffffff81f974ab>] dump_stack+0x95/0xf2
> [<ffffffff81f8fd3e>] __schedule_bug+0x108/0x126
> [<ffffffff81fa43dc>] __schedule+0x12dc/0x1590
> [<ffffffff81fa4965>] schedule+0x75/0x150
> [<ffffffff81fa7683>] ? mutex_lock_nested+0x393/0x780
> [<ffffffff81fa4db0>] schedule_preempt_disabled+0x30/0x60
> [<ffffffff81fa753a>] mutex_lock_nested+0x24a/0x780
> [<ffffffff81244b54>] ? static_key_slow_inc+0xc4/0x140
> [<ffffffff81244b54>] ? static_key_slow_inc+0xc4/0x140
> [<ffffffffa035fb48>] ? vcpu_load+0x28/0xf0 [kvm]
> [<ffffffff81244b54>] static_key_slow_inc+0xc4/0x140
> [<ffffffff810d36a5>] preempt_notifier_register+0x25/0x70
> [<ffffffffa035fb96>] vcpu_load+0x76/0xf0 [kvm]
> [<ffffffffa035fd3e>] kvm_vcpu_ioctl+0x7e/0xeb0 [kvm]
> [<ffffffff8110bc40>] ? __lock_is_held+0x70/0xa0
> [<ffffffff810dde79>] ? get_parent_ip+0x19/0x90
> [<ffffffff8130e4b4>] do_vfs_ioctl+0x3c4/0x910
> [<ffffffff81320a01>] ? expand_files+0x311/0x360
> [<ffffffff815370af>] ? selinux_file_ioctl+0x6f/0x150
> [<ffffffff8130eaad>] SyS_ioctl+0xad/0xe0
> [<ffffffff81faf857>] entry_SYSCALL_64_fastpath+0x12/0x6f
>
> CC: Gleb Natapov <[email protected]>
> CC: Paolo Bonzini <[email protected]>
> Signed-off-by: Tiejun Chen <[email protected]>
> ---
> virt/kvm/kvm_main.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 848af90..bde5f66f 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -127,8 +127,8 @@ int vcpu_load(struct kvm_vcpu *vcpu)
>
> if (mutex_lock_killable(&vcpu->mutex))
> return -EINTR;
> - cpu = get_cpu();
> preempt_notifier_register(&vcpu->preempt_notifier);
> + cpu = get_cpu();
> kvm_arch_vcpu_load(vcpu, cpu);
> put_cpu();
> return 0;
> @@ -138,8 +138,8 @@ void vcpu_put(struct kvm_vcpu *vcpu)
> {
> preempt_disable();
> kvm_arch_vcpu_put(vcpu);
> - preempt_notifier_unregister(&vcpu->preempt_notifier);
> preempt_enable();
> + preempt_notifier_unregister(&vcpu->preempt_notifier);
> mutex_unlock(&vcpu->mutex);
> }
>
>

2015-07-06 00:46:19

by Chen, Tiejun

Subject: Re: [PATCH 2/2] kvm: enable preemption to register/unregister preempt notifier

On 2015/7/3 19:23, Paolo Bonzini wrote:
> On 03/07/2015 10:56, Tiejun Chen wrote:
>> After commit 1cde2930e154 ("sched/preempt: Add static_key() to
>> preempt_notifiers") is introduced, preempt_notifier_{register, unregister}
>> always hold a mutex, jump_label_mutex. So in current case this shouldn't
>> work further under the circumstance of disabled preemption, and its also
>> safe since we're just handling a per-vcpu stuff with holding vcpu->mutex.
>> Otherwise, some warning messages are posted like this,
>>
>> BUG: scheduling while atomic: qemu-system-x86/17177/0x00000002
>> 2 locks held by qemu-system-x86/17177:
>> #0: (&vcpu->mutex){+.+.+.}, at: [<ffffffffa035fb48>] vcpu_load+0x28/0xf0 [kvm]
>> #1: (jump_label_mutex){+.+.+.}, at: [<ffffffff81244b54>] static_key_slow_inc+0xc4/0x140
>> Modules linked in: x86_pkg_temp_thermal kvm_intel kvm
>> Preemption disabled at:[<ffffffffa035fd3e>] kvm_vcpu_ioctl+0x7e/0xeb0 [kvm]
>
> Thanks for your work Tiejun. However, the original patch is crap. I've
> asked to revert it.
>

Yeah, it's better to revert that commit, since it also ends up triggering
bug 100671: vmwrite error in vmx_vcpu_run.

Thanks
Tiejun