2018-03-22 08:35:53

by Wanpeng Li

[permalink] [raw]
Subject: [PATCH] KVM: X86: Fix the decoding of segment overrides in 64bit mode

From: Wanpeng Li <[email protected]>

Explicit segment overides other than %fs and %gs are documented as ignored by
both Intel and AMD.

In practice, this means that:

* Explicit uses of %ss don't actually yield #SS[0] for non-canonical
memory references.
* Explicit uses of %{e,c,d}s don't override %rbp/%rsp-based memory references
to yield #GP[0] for non-canonical memory references.

Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
arch/x86/kvm/emulate.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index dd88158..5091255 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -5148,8 +5148,10 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
case 0x2e: /* CS override */
case 0x36: /* SS override */
case 0x3e: /* DS override */
- has_seg_override = true;
- ctxt->seg_override = (ctxt->b >> 3) & 3;
+ if (mode != X86EMUL_MODE_PROT64) {
+ has_seg_override = true;
+ ctxt->seg_override = (ctxt->b >> 3) & 3;
+ }
break;
case 0x64: /* FS override */
case 0x65: /* GS override */
--
2.7.4



2018-03-22 10:21:05

by Andrew Cooper

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix the decoding of segment overrides in 64bit mode

On 22/03/2018 10:07, Paolo Bonzini wrote:
> On 22/03/2018 09:34, Wanpeng Li wrote:
>> From: Wanpeng Li <[email protected]>
>>
>> Explicit segment overides other than %fs and %gs are documented as ignored by
>> both Intel and AMD.
>>
>> In practice, this means that:
>>
>> * Explicit uses of %ss don't actually yield #SS[0] for non-canonical
>> memory references.
>> * Explicit uses of %{e,c,d}s don't override %rbp/%rsp-based memory references
>> to yield #GP[0] for non-canonical memory references.
>>
>> Cc: Paolo Bonzini <[email protected]>
>> Cc: Radim Krčmář <[email protected]>
>> Signed-off-by: Wanpeng Li <[email protected]>

When porting fixes from other projects, it is customary to identify so
in the commit message.  In this case, the fix you've ported is
http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=b7dce29d9faf3597d009c853ed1fcbed9f7a7f68

Here is an example of how Xen ports fixes from Linux for the drivers
that we share. 
http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=4e131596f1defec9407b6e60d584a696beaf5d7e


>> ---
>> arch/x86/kvm/emulate.c | 6 ++++--
>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> index dd88158..5091255 100644
>> --- a/arch/x86/kvm/emulate.c
>> +++ b/arch/x86/kvm/emulate.c
>> @@ -5148,8 +5148,10 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
>> case 0x2e: /* CS override */
>> case 0x36: /* SS override */
>> case 0x3e: /* DS override */
>> - has_seg_override = true;
>> - ctxt->seg_override = (ctxt->b >> 3) & 3;
>> + if (mode != X86EMUL_MODE_PROT64) {
>> + has_seg_override = true;
>> + ctxt->seg_override = (ctxt->b >> 3) & 3;
>> + }
>> break;
>> case 0x64: /* FS override */
>> case 0x65: /* GS override */
>>
> Testcase, please...

If you want to crib from one, this is the testcase I made for Xen.

http://xenbits.xen.org/docs/xtf/test-memop-seg.html

With the impending KVM/PVH work which is ongoing, it will soon be easy
to run Xen's HVM test suite unmodified under KVM, but we're not quite
there yet.

~Andrew

2018-03-22 10:53:12

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix the decoding of segment overrides in 64bit mode

On 22/03/2018 11:19, Andrew Cooper wrote:
> On 22/03/2018 10:07, Paolo Bonzini wrote:
>> On 22/03/2018 09:34, Wanpeng Li wrote:
>>> From: Wanpeng Li <[email protected]>
>>>
>>> Explicit segment overides other than %fs and %gs are documented as ignored by
>>> both Intel and AMD.
>>>
>>> In practice, this means that:
>>>
>>> * Explicit uses of %ss don't actually yield #SS[0] for non-canonical
>>> memory references.
>>> * Explicit uses of %{e,c,d}s don't override %rbp/%rsp-based memory references
>>> to yield #GP[0] for non-canonical memory references.
>>>
>>> Cc: Paolo Bonzini <[email protected]>
>>> Cc: Radim Krčmář <[email protected]>
>>> Signed-off-by: Wanpeng Li <[email protected]>
>
> When porting fixes from other projects, it is customary to identify so
> in the commit message.  In this case, the fix you've ported is
> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=b7dce29d9faf3597d009c853ed1fcbed9f7a7f68
>
> Here is an example of how Xen ports fixes from Linux for the drivers
> that we share. 
> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=4e131596f1defec9407b6e60d584a696beaf5d7e

Thanks Andrew. The code is of course completely different, but the
commit message is 1:1. Wanpeng, please acknowledge Jan in v2!

>> Testcase, please...
>
> If you want to crib from one, this is the testcase I made for Xen.
>
> http://xenbits.xen.org/docs/xtf/test-memop-seg.html

How does it ensure that the code is executed through the emulator and
not by the processor?

> With the impending KVM/PVH work which is ongoing, it will soon be easy
> to run Xen's HVM test suite unmodified under KVM, but we're not quite
> there yet.

What does the test suite use for console I/O?

Paolo

2018-03-22 11:06:16

by Andrew Cooper

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix the decoding of segment overrides in 64bit mode

On 22/03/2018 10:42, Paolo Bonzini wrote:
> On 22/03/2018 11:19, Andrew Cooper wrote:
>> On 22/03/2018 10:07, Paolo Bonzini wrote:
>>> On 22/03/2018 09:34, Wanpeng Li wrote:
>>>> From: Wanpeng Li <[email protected]>
>>>>
>>>> Explicit segment overides other than %fs and %gs are documented as ignored by
>>>> both Intel and AMD.
>>>>
>>>> In practice, this means that:
>>>>
>>>> * Explicit uses of %ss don't actually yield #SS[0] for non-canonical
>>>> memory references.
>>>> * Explicit uses of %{e,c,d}s don't override %rbp/%rsp-based memory references
>>>> to yield #GP[0] for non-canonical memory references.
>>>>
>>>> Cc: Paolo Bonzini <[email protected]>
>>>> Cc: Radim Krčmář <[email protected]>
>>>> Signed-off-by: Wanpeng Li <[email protected]>
>> When porting fixes from other projects, it is customary to identify so
>> in the commit message.  In this case, the fix you've ported is
>> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=b7dce29d9faf3597d009c853ed1fcbed9f7a7f68
>>
>> Here is an example of how Xen ports fixes from Linux for the drivers
>> that we share. 
>> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=4e131596f1defec9407b6e60d584a696beaf5d7e
> Thanks Andrew. The code is of course completely different, but the
> commit message is 1:1. Wanpeng, please acknowledge Jan in v2!

Thanks, but it is actually my patch, which is why I was confused at
seeing my own commit message on LKML.

Also, the chances are that there are similar issues decoding the
instruction info field in the VMCS, which is how I stumbled onto this in
the first place.  I haven't yet fixed that side of things for Xen.

>
>>> Testcase, please...
>> If you want to crib from one, this is the testcase I made for Xen.
>>
>> http://xenbits.xen.org/docs/xtf/test-memop-seg.html
> How does it ensure that the code is executed through the emulator and
> not by the processor?

This test, and most of the tests in general, deliberately set things up
to execute the test cases first on the main processor, and then via the
emulator, and complain when any result is unexpected.

We've got a Force Emulation Prefix (ud2a; .ascii "xen") for doing
magic.  Originally, this was used for PV guests to explicitly request an
emulated CPUID, but I extended it to HVM guests for "emulate the next
instruction", after we had some guest user => guest kernel privilege
escalations because of incorrect emulation.

Fundamentally, any multi-vcpu guest can force an arbitrary instruction
through the emulator, because rewriting a couple of bytes of instruction
stream is far far far faster than a vmexit.  I chose to introduce a
explicit way to force this to occur, for testing purposes.

>
>> With the impending KVM/PVH work which is ongoing, it will soon be easy
>> to run Xen's HVM test suite unmodified under KVM, but we're not quite
>> there yet.
> What does the test suite use for console I/O?

Depends on what it available as it boots, but one of the default
consoles is port 0x12.  If things need to be tweaked to work more
cleanly, then that is entirely fine.

~Andrew

2018-03-22 11:24:42

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix the decoding of segment overrides in 64bit mode

On 22/03/2018 09:34, Wanpeng Li wrote:
> From: Wanpeng Li <[email protected]>
>
> Explicit segment overides other than %fs and %gs are documented as ignored by
> both Intel and AMD.
>
> In practice, this means that:
>
> * Explicit uses of %ss don't actually yield #SS[0] for non-canonical
> memory references.
> * Explicit uses of %{e,c,d}s don't override %rbp/%rsp-based memory references
> to yield #GP[0] for non-canonical memory references.
>
> Cc: Paolo Bonzini <[email protected]>
> Cc: Radim Krčmář <[email protected]>
> Signed-off-by: Wanpeng Li <[email protected]>
> ---
> arch/x86/kvm/emulate.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index dd88158..5091255 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -5148,8 +5148,10 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
> case 0x2e: /* CS override */
> case 0x36: /* SS override */
> case 0x3e: /* DS override */
> - has_seg_override = true;
> - ctxt->seg_override = (ctxt->b >> 3) & 3;
> + if (mode != X86EMUL_MODE_PROT64) {
> + has_seg_override = true;
> + ctxt->seg_override = (ctxt->b >> 3) & 3;
> + }
> break;
> case 0x64: /* FS override */
> case 0x65: /* GS override */
>

Testcase, please...

Paolo

2018-03-22 11:34:59

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix the decoding of segment overrides in 64bit mode

2018-03-22 19:04 GMT+08:00 Andrew Cooper <[email protected]>:
> On 22/03/2018 10:42, Paolo Bonzini wrote:
>> On 22/03/2018 11:19, Andrew Cooper wrote:
>>> On 22/03/2018 10:07, Paolo Bonzini wrote:
>>>> On 22/03/2018 09:34, Wanpeng Li wrote:
>>>>> From: Wanpeng Li <[email protected]>
>>>>>
>>>>> Explicit segment overides other than %fs and %gs are documented as ignored by
>>>>> both Intel and AMD.
>>>>>
>>>>> In practice, this means that:
>>>>>
>>>>> * Explicit uses of %ss don't actually yield #SS[0] for non-canonical
>>>>> memory references.
>>>>> * Explicit uses of %{e,c,d}s don't override %rbp/%rsp-based memory references
>>>>> to yield #GP[0] for non-canonical memory references.
>>>>>
>>>>> Cc: Paolo Bonzini <[email protected]>
>>>>> Cc: Radim Krčmář <[email protected]>
>>>>> Signed-off-by: Wanpeng Li <[email protected]>
>>> When porting fixes from other projects, it is customary to identify so
>>> in the commit message. In this case, the fix you've ported is
>>> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=b7dce29d9faf3597d009c853ed1fcbed9f7a7f68
>>>
>>> Here is an example of how Xen ports fixes from Linux for the drivers
>>> that we share.
>>> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=4e131596f1defec9407b6e60d584a696beaf5d7e
>> Thanks Andrew. The code is of course completely different, but the
>> commit message is 1:1. Wanpeng, please acknowledge Jan in v2!
>
> Thanks, but it is actually my patch, which is why I was confused at
> seeing my own commit message on LKML.

Thanks Andrew's original patch. Anyway, it is a really small stuff, I
will drop this patch to avoid the confusing.

Regards,
Wanpeng Li

>
> Also, the chances are that there are similar issues decoding the
> instruction info field in the VMCS, which is how I stumbled onto this in
> the first place. I haven't yet fixed that side of things for Xen.
>
>>
>>>> Testcase, please...
>>> If you want to crib from one, this is the testcase I made for Xen.
>>>
>>> http://xenbits.xen.org/docs/xtf/test-memop-seg.html
>> How does it ensure that the code is executed through the emulator and
>> not by the processor?
>
> This test, and most of the tests in general, deliberately set things up
> to execute the test cases first on the main processor, and then via the
> emulator, and complain when any result is unexpected.
>
> We've got a Force Emulation Prefix (ud2a; .ascii "xen") for doing
> magic. Originally, this was used for PV guests to explicitly request an
> emulated CPUID, but I extended it to HVM guests for "emulate the next
> instruction", after we had some guest user => guest kernel privilege
> escalations because of incorrect emulation.
>
> Fundamentally, any multi-vcpu guest can force an arbitrary instruction
> through the emulator, because rewriting a couple of bytes of instruction
> stream is far far far faster than a vmexit. I chose to introduce a
> explicit way to force this to occur, for testing purposes.
>
>>
>>> With the impending KVM/PVH work which is ongoing, it will soon be easy
>>> to run Xen's HVM test suite unmodified under KVM, but we're not quite
>>> there yet.
>> What does the test suite use for console I/O?
>
> Depends on what it available as it boots, but one of the default
> consoles is port 0x12. If things need to be tweaked to work more
> cleanly, then that is entirely fine.
>
> ~Andrew

2018-03-22 14:29:38

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix the decoding of segment overrides in 64bit mode

On 22/03/2018 12:04, Andrew Cooper wrote:
> We've got a Force Emulation Prefix (ud2a; .ascii "xen") for doing
> magic.  Originally, this was used for PV guests to explicitly request an
> emulated CPUID, but I extended it to HVM guests for "emulate the next
> instruction", after we had some guest user => guest kernel privilege
> escalations because of incorrect emulation.

Wanpeng, why don't you add it behind a new kvm module parameter? :)

Paolo

2018-03-22 14:31:57

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix the decoding of segment overrides in 64bit mode

2018-03-22 20:38 GMT+08:00 Paolo Bonzini <[email protected]>:
> On 22/03/2018 12:04, Andrew Cooper wrote:
>> We've got a Force Emulation Prefix (ud2a; .ascii "xen") for doing
>> magic. Originally, this was used for PV guests to explicitly request an
>> emulated CPUID, but I extended it to HVM guests for "emulate the next
>> instruction", after we had some guest user => guest kernel privilege
>> escalations because of incorrect emulation.
>
> Wanpeng, why don't you add it behind a new kvm module parameter? :)

Great point! I will have a try. Thanks Paolo and Andrew. :)

Regards,
Wanpeng Li

2018-03-22 14:49:27

by Andrew Cooper

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix the decoding of segment overrides in 64bit mode

On 22/03/18 13:39, Wanpeng Li wrote:
> 2018-03-22 20:38 GMT+08:00 Paolo Bonzini <[email protected]>:
>> On 22/03/2018 12:04, Andrew Cooper wrote:
>>> We've got a Force Emulation Prefix (ud2a; .ascii "xen") for doing
>>> magic. Originally, this was used for PV guests to explicitly request an
>>> emulated CPUID, but I extended it to HVM guests for "emulate the next
>>> instruction", after we had some guest user => guest kernel privilege
>>> escalations because of incorrect emulation.
>> Wanpeng, why don't you add it behind a new kvm module parameter? :)
> Great point! I will have a try. Thanks Paolo and Andrew. :)

Using the force emulation prefix requires intercepting #UD, which is in
general a BadThing(tm) for security.  Therefore, we have a build time
configuration option to compile in support, and require that test
systems explicitly opt into using it via a command line parameter.

http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/hvm/hvm.c;h=db52312882205e65b32e587106ca795f4bfab2eb;hb=refs/heads/staging#l3741
is the general #UD intercept handler if you want a reference.  (You can
ignore the cross-vendor part, which is leftovers from
http://developer.amd.com/wordpress/media/2012/10/CrossVendorMigration.pdf )

~Andrew

2018-03-23 14:30:26

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix the decoding of segment overrides in 64bit mode

2018-03-22 21:53 GMT+08:00 Andrew Cooper <[email protected]>:
> On 22/03/18 13:39, Wanpeng Li wrote:
>> 2018-03-22 20:38 GMT+08:00 Paolo Bonzini <[email protected]>:
>>> On 22/03/2018 12:04, Andrew Cooper wrote:
>>>> We've got a Force Emulation Prefix (ud2a; .ascii "xen") for doing
>>>> magic. Originally, this was used for PV guests to explicitly request an
>>>> emulated CPUID, but I extended it to HVM guests for "emulate the next
>>>> instruction", after we had some guest user => guest kernel privilege
>>>> escalations because of incorrect emulation.
>>> Wanpeng, why don't you add it behind a new kvm module parameter? :)
>> Great point! I will have a try. Thanks Paolo and Andrew. :)
>
> Using the force emulation prefix requires intercepting #UD, which is in
> general a BadThing(tm) for security. Therefore, we have a build time

Yeah, however kvm intercepts and emulates #UD by default, should we
add a new kvm module parameter to enable it and disable by default?
Paolo.

> configuration option to compile in support, and require that test
> systems explicitly opt into using it via a command line parameter.
>
> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/hvm/hvm.c;h=db52312882205e65b32e587106ca795f4bfab2eb;hb=refs/heads/staging#l3741
> is the general #UD intercept handler if you want a reference. (You can

Thanks Andrew, it is useful. :) In addition, I didn't see the
test-memop-seg testcase has "Forced Emulation Prefix", when the prefix
is added to each instruction in the testcase?

Regards,
Wanpeng Li

2018-03-23 14:45:03

by Andrew Cooper

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix the decoding of segment overrides in 64bit mode

On 23/03/18 14:27, Wanpeng Li wrote:
> 2018-03-22 21:53 GMT+08:00 Andrew Cooper <[email protected]>:
>> On 22/03/18 13:39, Wanpeng Li wrote:
>>> 2018-03-22 20:38 GMT+08:00 Paolo Bonzini <[email protected]>:
>>>> On 22/03/2018 12:04, Andrew Cooper wrote:
>>>>> We've got a Force Emulation Prefix (ud2a; .ascii "xen") for doing
>>>>> magic. Originally, this was used for PV guests to explicitly request an
>>>>> emulated CPUID, but I extended it to HVM guests for "emulate the next
>>>>> instruction", after we had some guest user => guest kernel privilege
>>>>> escalations because of incorrect emulation.
>>>> Wanpeng, why don't you add it behind a new kvm module parameter? :)
>>> Great point! I will have a try. Thanks Paolo and Andrew. :)
>> Using the force emulation prefix requires intercepting #UD, which is in
>> general a BadThing(tm) for security. Therefore, we have a build time
> Yeah, however kvm intercepts and emulates #UD by default, should we
> add a new kvm module parameter to enable it and disable by default?
> Paolo.
>
>> configuration option to compile in support, and require that test
>> systems explicitly opt into using it via a command line parameter.
>>
>> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/hvm/hvm.c;h=db52312882205e65b32e587106ca795f4bfab2eb;hb=refs/heads/staging#l3741
>> is the general #UD intercept handler if you want a reference. (You can
> Thanks Andrew, it is useful. :) In addition, I didn't see the
> test-memop-seg testcase has "Forced Emulation Prefix", when the prefix
> is added to each instruction in the testcase?

It has ended up substantially more ugly than I first intended, due to
several assembler bugs in older GCC and Clang toolchains.

http://xenbits.xen.org/gitweb/?p=xtf.git;a=blob;f=tests/memop-seg/asm.S;h=698661425bcdc9c181b235e323c2460e06c6e986;hb=HEAD#l35

I previously has FEP passed as a second parameter, but that becomes
prohibitively complicated to extract when testing %ss or %esp.  FEP is
now encoded in the bottom bit of the address passed in.

This was the cleanest way I could find of testing every combination, but
I'm open to improvements if anyone can spot any.

~Andrew

2018-03-23 15:06:43

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix the decoding of segment overrides in 64bit mode

On 23/03/2018 15:27, Wanpeng Li wrote:
> 2018-03-22 21:53 GMT+08:00 Andrew Cooper <[email protected]>:
>> On 22/03/18 13:39, Wanpeng Li wrote:
>>> 2018-03-22 20:38 GMT+08:00 Paolo Bonzini <[email protected]>:
>>>> On 22/03/2018 12:04, Andrew Cooper wrote:
>>>>> We've got a Force Emulation Prefix (ud2a; .ascii "xen") for doing
>>>>> magic. Originally, this was used for PV guests to explicitly request an
>>>>> emulated CPUID, but I extended it to HVM guests for "emulate the next
>>>>> instruction", after we had some guest user => guest kernel privilege
>>>>> escalations because of incorrect emulation.
>>>> Wanpeng, why don't you add it behind a new kvm module parameter? :)
>>> Great point! I will have a try. Thanks Paolo and Andrew. :)
>>
>> Using the force emulation prefix requires intercepting #UD, which is in
>> general a BadThing(tm) for security. Therefore, we have a build time
>
> Yeah, however kvm intercepts and emulates #UD by default, should we
> add a new kvm module parameter to enable it and disable by default?

No, the module parameter should only be about the force-emulation prefix.

Paolo

2018-03-26 12:26:37

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix the decoding of segment overrides in 64bit mode

2018-03-23 23:04 GMT+08:00 Paolo Bonzini <[email protected]>:
> On 23/03/2018 15:27, Wanpeng Li wrote:
>> 2018-03-22 21:53 GMT+08:00 Andrew Cooper <[email protected]>:
>>> On 22/03/18 13:39, Wanpeng Li wrote:
>>>> 2018-03-22 20:38 GMT+08:00 Paolo Bonzini <[email protected]>:
>>>>> On 22/03/2018 12:04, Andrew Cooper wrote:
>>>>>> We've got a Force Emulation Prefix (ud2a; .ascii "xen") for doing
>>>>>> magic. Originally, this was used for PV guests to explicitly request an
>>>>>> emulated CPUID, but I extended it to HVM guests for "emulate the next
>>>>>> instruction", after we had some guest user => guest kernel privilege
>>>>>> escalations because of incorrect emulation.
>>>>> Wanpeng, why don't you add it behind a new kvm module parameter? :)
>>>> Great point! I will have a try. Thanks Paolo and Andrew. :)
>>>
>>> Using the force emulation prefix requires intercepting #UD, which is in
>>> general a BadThing(tm) for security. Therefore, we have a build time
>>
>> Yeah, however kvm intercepts and emulates #UD by default, should we
>> add a new kvm module parameter to enable it and disable by default?
>
> No, the module parameter should only be about the force-emulation prefix.

How about something like this? (Add EmulateOnUD to cpuid, the testcase
will use it)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index dd88158..80da5c6 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -4772,7 +4772,7 @@ static const struct opcode twobyte_table[256] = {
X16(D(ByteOp | DstMem | SrcNone | ModRM| Mov)),
/* 0xA0 - 0xA7 */
I(Stack | Src2FS, em_push_sreg), I(Stack | Src2FS, em_pop_sreg),
- II(ImplicitOps, em_cpuid, cpuid),
+ II(EmulateOnUD | ImplicitOps, em_cpuid, cpuid),
F(DstMem | SrcReg | ModRM | BitOp | NoWrite, em_bt),
F(DstMem | SrcReg | Src2ImmByte | ModRM, em_shld),
F(DstMem | SrcReg | Src2CL | ModRM, em_shld), N, N,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9bc05f5..1825b45 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -108,6 +108,9 @@ module_param_named(enable_shadow_vmcs,
enable_shadow_vmcs, bool, S_IRUGO);
static bool __read_mostly nested = 0;
module_param(nested, bool, S_IRUGO);

+static bool __read_mostly fep = 0;
+module_param(fep, bool, S_IRUGO);
+
static u64 __read_mostly host_xss;

static bool __read_mostly enable_pml = 1;
@@ -6215,6 +6218,27 @@ static int handle_machine_check(struct kvm_vcpu *vcpu)
return 1;
}

+static int handle_ud(struct kvm_vcpu *vcpu)
+{
+ enum emulation_result er;
+
+ if (fep) {
+ char sig[5]; /* ud2; .ascii "kvm" */
+ struct x86_exception e;
+
+ kvm_read_guest_virt(&vcpu->arch.emulate_ctxt,
+ kvm_get_linear_rip(vcpu), sig, sizeof(sig), &e);
+ if (memcmp(sig, "\xf\xbkvm", sizeof(sig)) == 0)
+ kvm_rip_write(vcpu, kvm_rip_read(vcpu) + sizeof(sig));
+ }
+ er = emulate_instruction(vcpu, EMULTYPE_TRAP_UD);
+ if (er == EMULATE_USER_EXIT)
+ return 0;
+ if (er != EMULATE_DONE)
+ kvm_queue_exception(vcpu, UD_VECTOR);
+ return 1;
+}
+
static int handle_exception(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -6233,14 +6257,8 @@ static int handle_exception(struct kvm_vcpu *vcpu)
if (is_nmi(intr_info))
return 1; /* already handled by vmx_vcpu_run() */

- if (is_invalid_opcode(intr_info)) {
- er = emulate_instruction(vcpu, EMULTYPE_TRAP_UD);
- if (er == EMULATE_USER_EXIT)
- return 0;
- if (er != EMULATE_DONE)
- kvm_queue_exception(vcpu, UD_VECTOR);
- return 1;
- }
+ if (is_invalid_opcode(intr_info))
+ return handle_ud(vcpu);

error_code = 0;
if (intr_info & INTR_INFO_DELIVER_CODE_MASK)


The testcase:

#include <stdio.h>
#include <string.h>

#define HYPERVISOR_INFO 0x40000000

#define CPUID(idx, eax, ebx, ecx, edx)\
asm volatile (\
"test %1,%1;jz 1f; ud2a; .ascii \"kvm\"; 1: cpuid" \
:"=b" (*ebx), "=a" (*eax),"=c" (*ecx), "=d" (*edx)\
:"0"(idx) );

void main()
{
unsigned int eax,ebx,ecx,edx;
char string[13];

CPUID(HYPERVISOR_INFO, &eax, &ebx, &ecx, &edx);
*(unsigned int *)(string+0) = ebx;
*(unsigned int *)(string+4) = ecx;
*(unsigned int *)(string+8) = edx;

string[12] = 0;
if (strncmp(string, "KVMKVMKVM\0\0\0",12) == 0) {
printf("kvm guest\n");
} else
printf("bare hardware\n");

}

Regards,
Wanpeng Li

2018-03-26 12:29:10

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH] KVM: X86: Fix the decoding of segment overrides in 64bit mode

On 26/03/2018 14:25, Wanpeng Li wrote:
> 2018-03-23 23:04 GMT+08:00 Paolo Bonzini <[email protected]>:
>> On 23/03/2018 15:27, Wanpeng Li wrote:
>>> 2018-03-22 21:53 GMT+08:00 Andrew Cooper <[email protected]>:
>>>> On 22/03/18 13:39, Wanpeng Li wrote:
>>>>> 2018-03-22 20:38 GMT+08:00 Paolo Bonzini <[email protected]>:
>>>>>> On 22/03/2018 12:04, Andrew Cooper wrote:
>>>>>>> We've got a Force Emulation Prefix (ud2a; .ascii "xen") for doing
>>>>>>> magic. Originally, this was used for PV guests to explicitly request an
>>>>>>> emulated CPUID, but I extended it to HVM guests for "emulate the next
>>>>>>> instruction", after we had some guest user => guest kernel privilege
>>>>>>> escalations because of incorrect emulation.
>>>>>> Wanpeng, why don't you add it behind a new kvm module parameter? :)
>>>>> Great point! I will have a try. Thanks Paolo and Andrew. :)
>>>>
>>>> Using the force emulation prefix requires intercepting #UD, which is in
>>>> general a BadThing(tm) for security. Therefore, we have a build time
>>>
>>> Yeah, however kvm intercepts and emulates #UD by default, should we
>>> add a new kvm module parameter to enable it and disable by default?
>>
>> No, the module parameter should only be about the force-emulation prefix.
>
> How about something like this? (Add EmulateOnUD to cpuid, the testcase
> will use it)

I think you don't need either EmulateOnUD or EMULTYPE_TRAP_UD (the
latter only when fep=1 of course). Otherwise yes.

Paolo

>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index dd88158..80da5c6 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -4772,7 +4772,7 @@ static const struct opcode twobyte_table[256] = {
> X16(D(ByteOp | DstMem | SrcNone | ModRM| Mov)),
> /* 0xA0 - 0xA7 */
> I(Stack | Src2FS, em_push_sreg), I(Stack | Src2FS, em_pop_sreg),
> - II(ImplicitOps, em_cpuid, cpuid),
> + II(EmulateOnUD | ImplicitOps, em_cpuid, cpuid),
> F(DstMem | SrcReg | ModRM | BitOp | NoWrite, em_bt),
> F(DstMem | SrcReg | Src2ImmByte | ModRM, em_shld),
> F(DstMem | SrcReg | Src2CL | ModRM, em_shld), N, N,
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 9bc05f5..1825b45 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -108,6 +108,9 @@ module_param_named(enable_shadow_vmcs,
> enable_shadow_vmcs, bool, S_IRUGO);
> static bool __read_mostly nested = 0;
> module_param(nested, bool, S_IRUGO);
>
> +static bool __read_mostly fep = 0;
> +module_param(fep, bool, S_IRUGO);
> +
> static u64 __read_mostly host_xss;
>
> static bool __read_mostly enable_pml = 1;
> @@ -6215,6 +6218,27 @@ static int handle_machine_check(struct kvm_vcpu *vcpu)
> return 1;
> }
>
> +static int handle_ud(struct kvm_vcpu *vcpu)
> +{
> + enum emulation_result er;
> +
> + if (fep) {
> + char sig[5]; /* ud2; .ascii "kvm" */
> + struct x86_exception e;
> +
> + kvm_read_guest_virt(&vcpu->arch.emulate_ctxt,
> + kvm_get_linear_rip(vcpu), sig, sizeof(sig), &e);
> + if (memcmp(sig, "\xf\xbkvm", sizeof(sig)) == 0)
> + kvm_rip_write(vcpu, kvm_rip_read(vcpu) + sizeof(sig));
> + }
> + er = emulate_instruction(vcpu, EMULTYPE_TRAP_UD);
> + if (er == EMULATE_USER_EXIT)
> + return 0;
> + if (er != EMULATE_DONE)
> + kvm_queue_exception(vcpu, UD_VECTOR);
> + return 1;
> +}
> +
> static int handle_exception(struct kvm_vcpu *vcpu)
> {
> struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -6233,14 +6257,8 @@ static int handle_exception(struct kvm_vcpu *vcpu)
> if (is_nmi(intr_info))
> return 1; /* already handled by vmx_vcpu_run() */
>
> - if (is_invalid_opcode(intr_info)) {
> - er = emulate_instruction(vcpu, EMULTYPE_TRAP_UD);
> - if (er == EMULATE_USER_EXIT)
> - return 0;
> - if (er != EMULATE_DONE)
> - kvm_queue_exception(vcpu, UD_VECTOR);
> - return 1;
> - }
> + if (is_invalid_opcode(intr_info))
> + return handle_ud(vcpu);
>
> error_code = 0;
> if (intr_info & INTR_INFO_DELIVER_CODE_MASK)
>
>
> The testcase:
>
> #include <stdio.h>
> #include <string.h>
>
> #define HYPERVISOR_INFO 0x40000000
>
> #define CPUID(idx, eax, ebx, ecx, edx)\
> asm volatile (\
> "test %1,%1;jz 1f; ud2a; .ascii \"kvm\"; 1: cpuid" \
> :"=b" (*ebx), "=a" (*eax),"=c" (*ecx), "=d" (*edx)\
> :"0"(idx) );
>
> void main()
> {
> unsigned int eax,ebx,ecx,edx;
> char string[13];
>
> CPUID(HYPERVISOR_INFO, &eax, &ebx, &ecx, &edx);
> *(unsigned int *)(string+0) = ebx;
> *(unsigned int *)(string+4) = ecx;
> *(unsigned int *)(string+8) = edx;
>
> string[12] = 0;
> if (strncmp(string, "KVMKVMKVM\0\0\0",12) == 0) {
> printf("kvm guest\n");
> } else
> printf("bare hardware\n");
>
> }
>
> Regards,
> Wanpeng Li
>