2015-02-20 22:41:28

by Joel Schopp

[permalink] [raw]
Subject: [PATCH] x86: svm: don't intercept CR0 TS or MP bit write

From: David Kaplan <[email protected]>

Reduce the number of exits by avoiding exiting when the guest writes TS or MP
bits of CR0. INTERCEPT_CR0_WRITE intercepts all writes to CR0 including TS and
MP bits. It intercepts these even if INTERCEPT_SELECTIVE_CR0 is set. What we
should be doing is setting INTERCEPT_SELECTIVE_CR0 and not setting
INTERCEPT_CR0_WRITE.

Signed-off-by: David Kaplan <[email protected]>
[added remove of clr_cr_intercept in init_vmcb, fixed check in handle_exit,
added emulation on interception back in, forward ported, tested]
Signed-off-by: Joel Schopp <[email protected]>
---
arch/x86/kvm/svm.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d319e0c..55822e5 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1093,7 +1093,6 @@ static void init_vmcb(struct vcpu_svm *svm)
set_cr_intercept(svm, INTERCEPT_CR0_READ);
set_cr_intercept(svm, INTERCEPT_CR3_READ);
set_cr_intercept(svm, INTERCEPT_CR4_READ);
- set_cr_intercept(svm, INTERCEPT_CR0_WRITE);
set_cr_intercept(svm, INTERCEPT_CR3_WRITE);
set_cr_intercept(svm, INTERCEPT_CR4_WRITE);
set_cr_intercept(svm, INTERCEPT_CR8_WRITE);
@@ -1539,10 +1538,8 @@ static void update_cr0_intercept(struct vcpu_svm *svm)

if (gcr0 == *hcr0 && svm->vcpu.fpu_active) {
clr_cr_intercept(svm, INTERCEPT_CR0_READ);
- clr_cr_intercept(svm, INTERCEPT_CR0_WRITE);
} else {
set_cr_intercept(svm, INTERCEPT_CR0_READ);
- set_cr_intercept(svm, INTERCEPT_CR0_WRITE);
}
}

@@ -2940,7 +2937,11 @@ static int cr_interception(struct vcpu_svm *svm)
return emulate_on_interception(svm);

reg = svm->vmcb->control.exit_info_1 & SVM_EXITINFO_REG_MASK;
- cr = svm->vmcb->control.exit_code - SVM_EXIT_READ_CR0;
+
+ if (svm->vmcb->control.exit_code == SVM_EXIT_CR0_SEL_WRITE)
+ cr = 16;
+ else
+ cr = svm->vmcb->control.exit_code - SVM_EXIT_READ_CR0;

err = 0;
if (cr >= 16) { /* mov to cr */
@@ -3325,7 +3326,7 @@ static int (*const svm_exit_handlers[])(struct vcpu_svm *svm) = {
[SVM_EXIT_READ_CR3] = cr_interception,
[SVM_EXIT_READ_CR4] = cr_interception,
[SVM_EXIT_READ_CR8] = cr_interception,
- [SVM_EXIT_CR0_SEL_WRITE] = emulate_on_interception,
+ [SVM_EXIT_CR0_SEL_WRITE] = cr_interception,
[SVM_EXIT_WRITE_CR0] = cr_interception,
[SVM_EXIT_WRITE_CR3] = cr_interception,
[SVM_EXIT_WRITE_CR4] = cr_interception,
@@ -3502,7 +3503,7 @@ static int handle_exit(struct kvm_vcpu *vcpu)
struct kvm_run *kvm_run = vcpu->run;
u32 exit_code = svm->vmcb->control.exit_code;

- if (!is_cr_intercept(svm, INTERCEPT_CR0_WRITE))
+ if (!is_cr_intercept(svm, INTERCEPT_SELECTIVE_CR0))
vcpu->arch.cr0 = svm->vmcb->save.cr0;
if (npt_enabled)
vcpu->arch.cr3 = svm->vmcb->save.cr3;


2015-02-23 19:18:05

by Radim Krčmář

[permalink] [raw]
Subject: Re: [PATCH] x86: svm: don't intercept CR0 TS or MP bit write

2015-02-20 16:44-0600, Joel Schopp:
> From: David Kaplan <[email protected]>
>
> Reduce the number of exits by avoiding exiting when the guest writes TS or MP
> bits of CR0. INTERCEPT_CR0_WRITE intercepts all writes to CR0 including TS and
> MP bits. It intercepts these even if INTERCEPT_SELECTIVE_CR0 is set. What we
> should be doing is setting INTERCEPT_SELECTIVE_CR0 and not setting
> INTERCEPT_CR0_WRITE.
>
> Signed-off-by: David Kaplan <[email protected]>
> [added remove of clr_cr_intercept in init_vmcb, fixed check in handle_exit,
> added emulation on interception back in, forward ported, tested]
> Signed-off-by: Joel Schopp <[email protected]>
> ---
> arch/x86/kvm/svm.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index d319e0c..55822e5 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -1539,10 +1538,8 @@ static void update_cr0_intercept(struct vcpu_svm *svm)
>
> if (gcr0 == *hcr0 && svm->vcpu.fpu_active) {
> clr_cr_intercept(svm, INTERCEPT_CR0_READ);


> - clr_cr_intercept(svm, INTERCEPT_CR0_WRITE);
> } else {
> set_cr_intercept(svm, INTERCEPT_CR0_READ);

(There is no point in checking fpu_active if cr0s are equal.)

> - set_cr_intercept(svm, INTERCEPT_CR0_WRITE);

KVM uses lazy FPU and the state is undefined before the first access.
We set cr0.ts when !svm->vcpu.fpu_active to detect the first access, but
if we allow the guest to clear cr0.ts without exiting, it can access FPU
with undefined state.

Is this code failing to disable the intercept when FPU is active?

> @@ -2940,7 +2937,11 @@ static int cr_interception(struct vcpu_svm *svm)
> + if (svm->vmcb->control.exit_code == SVM_EXIT_CR0_SEL_WRITE)
> + cr = 16;
^^^

> + else
> + cr = svm->vmcb->control.exit_code - SVM_EXIT_READ_CR0;
^^^

Linux uses tabs for indentation.

> @@ -3502,7 +3503,7 @@ static int handle_exit(struct kvm_vcpu *vcpu)
> struct kvm_run *kvm_run = vcpu->run;
> u32 exit_code = svm->vmcb->control.exit_code;
>
> - if (!is_cr_intercept(svm, INTERCEPT_CR0_WRITE))
> + if (!is_cr_intercept(svm, INTERCEPT_SELECTIVE_CR0))
> vcpu->arch.cr0 = svm->vmcb->save.cr0;

I think the purpose of this code is to get changes that happened while
we weren't monitoring CR0, and we introduce a bug here ... MP and TS can
change, but those changes are lost to arch.cr0 now.

(The original code was also suspicious -- we propagate changes of
vmcb->save.cr0. I don't think we want to clear guest's CD and NW /
set MP and TS, which is the result of what we do in svm_set_cr0() /
update_cr0_intercept() ...)

2015-02-24 21:58:27

by Joel Schopp

[permalink] [raw]
Subject: Re: [PATCH] x86: svm: don't intercept CR0 TS or MP bit write


>> - clr_cr_intercept(svm, INTERCEPT_CR0_WRITE);
>> } else {
>> set_cr_intercept(svm, INTERCEPT_CR0_READ);
> (There is no point in checking fpu_active if cr0s are equal.)
>
>> - set_cr_intercept(svm, INTERCEPT_CR0_WRITE);
> KVM uses lazy FPU and the state is undefined before the first access.
> We set cr0.ts when !svm->vcpu.fpu_active to detect the first access, but
> if we allow the guest to clear cr0.ts without exiting, it can access FPU
> with undefined state.
Thanks for the valuable feedback. It's apparent I hadn't thought
through the interaction with lazy FPU and will need to go back and
rethink my approach here.

2015-02-25 20:26:28

by Radim Krčmář

[permalink] [raw]
Subject: Re: [PATCH] x86: svm: don't intercept CR0 TS or MP bit write

2015-02-24 15:25-0600, Joel Schopp:
>
> >> - clr_cr_intercept(svm, INTERCEPT_CR0_WRITE);
> >> } else {
> >> set_cr_intercept(svm, INTERCEPT_CR0_READ);
> > (There is no point in checking fpu_active if cr0s are equal.)
> >
> >> - set_cr_intercept(svm, INTERCEPT_CR0_WRITE);
> > KVM uses lazy FPU and the state is undefined before the first access.
> > We set cr0.ts when !svm->vcpu.fpu_active to detect the first access, but
> > if we allow the guest to clear cr0.ts without exiting, it can access FPU
> > with undefined state.
> Thanks for the valuable feedback. It's apparent I hadn't thought
> through the interaction with lazy FPU and will need to go back and
> rethink my approach here.

I don't think we can gain much without sacrificing some laziness, like:
when a guest with lazy FPU clears CR0.TS, it is going to use that FPU,
so we could pre-load FPU in this case and drop the write intercept too;
guests that unconditionally clear CR0.TS would perform worse though.

It's going to take a lot of time, but two hunks in your patch, that made
selective intercept benefit from decode assists, look useful even now.

Would you post them separately?

Thanks.

2015-02-25 22:40:04

by Joel Schopp

[permalink] [raw]
Subject: Re: [PATCH] x86: svm: don't intercept CR0 TS or MP bit write


On 02/25/2015 02:26 PM, Radim Krčmář wrote:
> 2015-02-24 15:25-0600, Joel Schopp:
>>>> - clr_cr_intercept(svm, INTERCEPT_CR0_WRITE);
>>>> } else {
>>>> set_cr_intercept(svm, INTERCEPT_CR0_READ);
>>> (There is no point in checking fpu_active if cr0s are equal.)
>>>
>>>> - set_cr_intercept(svm, INTERCEPT_CR0_WRITE);
>>> KVM uses lazy FPU and the state is undefined before the first access.
>>> We set cr0.ts when !svm->vcpu.fpu_active to detect the first access, but
>>> if we allow the guest to clear cr0.ts without exiting, it can access FPU
>>> with undefined state.
>> Thanks for the valuable feedback. It's apparent I hadn't thought
>> through the interaction with lazy FPU and will need to go back and
>> rethink my approach here.
> I don't think we can gain much without sacrificing some laziness, like:
> when a guest with lazy FPU clears CR0.TS, it is going to use that FPU,
> so we could pre-load FPU in this case and drop the write intercept too;
> guests that unconditionally clear CR0.TS would perform worse though.
>
> It's going to take a lot of time, but two hunks in your patch, that made
> selective intercept benefit from decode assists, look useful even now.
>
> Would you post them separately?
I can re-post those separately. They are less useful, though probably
still worth doing, on their own because SVM_EXIT_WRITE_CR0 takes
precidence over SVM_EXIT_CR0_SEL_WRITE