KVM does not handle the reserved bits of x86 page tables correctly:
In PAE, bits 5:8 are reserved in the PDPTE.
In IA-32e, bit 8 is not reserved.
Signed-off-by: Nadav Amit <[email protected]>
---
arch/x86/kvm/mmu.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index f5704d9..3993976 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3538,7 +3538,7 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
case PT32E_ROOT_LEVEL:
context->rsvd_bits_mask[0][2] =
rsvd_bits(maxphyaddr, 63) |
- rsvd_bits(7, 8) | rsvd_bits(1, 2); /* PDPTE */
+ rsvd_bits(5, 8) | rsvd_bits(1, 2); /* PDPTE */
context->rsvd_bits_mask[0][1] = exb_bit_rsvd |
rsvd_bits(maxphyaddr, 62); /* PDE */
context->rsvd_bits_mask[0][0] = exb_bit_rsvd |
@@ -3550,9 +3550,9 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
break;
case PT64_ROOT_LEVEL:
context->rsvd_bits_mask[0][3] = exb_bit_rsvd |
- rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 8);
+ rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
context->rsvd_bits_mask[0][2] = exb_bit_rsvd |
- rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 8);
+ rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
context->rsvd_bits_mask[0][1] = exb_bit_rsvd |
rsvd_bits(maxphyaddr, 51);
context->rsvd_bits_mask[0][0] = exb_bit_rsvd |
--
1.7.10.4
On Fri, Apr 04, 2014 at 06:31:04AM +0300, Nadav Amit wrote:
> KVM does not handle the reserved bits of x86 page tables correctly:
> In PAE, bits 5:8 are reserved in the PDPTE.
> In IA-32e, bit 8 is not reserved.
>
> Signed-off-by: Nadav Amit <[email protected]>
> ---
> arch/x86/kvm/mmu.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index f5704d9..3993976 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -3538,7 +3538,7 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
> case PT32E_ROOT_LEVEL:
> context->rsvd_bits_mask[0][2] =
> rsvd_bits(maxphyaddr, 63) |
> - rsvd_bits(7, 8) | rsvd_bits(1, 2); /* PDPTE */
> + rsvd_bits(5, 8) | rsvd_bits(1, 2); /* PDPTE */
> context->rsvd_bits_mask[0][1] = exb_bit_rsvd |
> rsvd_bits(maxphyaddr, 62); /* PDE */
> context->rsvd_bits_mask[0][0] = exb_bit_rsvd |
> @@ -3550,9 +3550,9 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
> break;
> case PT64_ROOT_LEVEL:
> context->rsvd_bits_mask[0][3] = exb_bit_rsvd |
> - rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 8);
> + rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
> context->rsvd_bits_mask[0][2] = exb_bit_rsvd |
> - rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 8);
> + rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
Bit 7 is not reserved either, for the PDPTE (its PageSize bit).
On 04/16/2014 12:03 PM, Marcelo Tosatti wrote:
>> @@ -3550,9 +3550,9 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
>> break;
>> case PT64_ROOT_LEVEL:
>> context->rsvd_bits_mask[0][3] = exb_bit_rsvd |
>> - rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 8);
>> + rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
>> context->rsvd_bits_mask[0][2] = exb_bit_rsvd |
>> - rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 8);
>> + rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
>
> Bit 7 is not reserved either, for the PDPTE (its PageSize bit).
>
In long mode (IA-32e), bit 7 is definitely reserved.
-hpa
On Wed, Apr 16, 2014 at 02:17:08PM -0700, H. Peter Anvin wrote:
> On 04/16/2014 12:03 PM, Marcelo Tosatti wrote:
> >> @@ -3550,9 +3550,9 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
> >> break;
> >> case PT64_ROOT_LEVEL:
> >> context->rsvd_bits_mask[0][3] = exb_bit_rsvd |
> >> - rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 8);
> >> + rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
> >> context->rsvd_bits_mask[0][2] = exb_bit_rsvd |
> >> - rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 8);
> >> + rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
> >
> > Bit 7 is not reserved either, for the PDPTE (its PageSize bit).
> >
>
> In long mode (IA-32e), bit 7 is definitely reserved.
>
> -hpa
There is a separate reserved mask for PS=1, nevermind.
Il 17/04/2014 00:04, Marcelo Tosatti ha scritto:
>>>> > >> @@ -3550,9 +3550,9 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
>>>> > >> break;
>>>> > >> case PT64_ROOT_LEVEL:
>>>> > >> context->rsvd_bits_mask[0][3] = exb_bit_rsvd |
>>>> > >> - rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 8);
>>>> > >> + rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
>>>> > >> context->rsvd_bits_mask[0][2] = exb_bit_rsvd |
>>>> > >> - rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 8);
>>>> > >> + rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
>>> > >
>>> > > Bit 7 is not reserved either, for the PDPTE (its PageSize bit).
>>> > >
>> >
>> > In long mode (IA-32e), bit 7 is definitely reserved.
It's always reserved for PML4E (rsvd_bits_mask[x][3]), while for PDPTEs
it is not reserved if you have 1GB pages.
> There is a separate reserved mask for PS=1, nevermind.
>
Yeah, but the situation for IA32e rsvd_bits_mask[0][2] is exactly the
same as for PAE rsvd_bits_mask[0][1], and we're not marking the bit as
reserved there.
The right thing to do is to add rsvd_bits(7, 7) to both
rsvd_bits_mask[0][2] and rsvd_bits_mask[1][2], if 1GB pages are not
supported.
As written, the patch has no effect on PDPTEs because
rsvd_bits_mask[0][2] is only accessed if bit 7 is zero.
Nadav, would you mind preparing a follow-up? Also, how did you find
these issues and test the fixes?
Paolo
On Apr 28, 2014, at 1:41 PM, Paolo Bonzini <[email protected]> wrote:
> Il 17/04/2014 00:04, Marcelo Tosatti ha scritto:
>>>>> > >> @@ -3550,9 +3550,9 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
>>>>> > >> break;
>>>>> > >> case PT64_ROOT_LEVEL:
>>>>> > >> context->rsvd_bits_mask[0][3] = exb_bit_rsvd |
>>>>> > >> - rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 8);
>>>>> > >> + rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
>>>>> > >> context->rsvd_bits_mask[0][2] = exb_bit_rsvd |
>>>>> > >> - rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 8);
>>>>> > >> + rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
>>>> > >
>>>> > > Bit 7 is not reserved either, for the PDPTE (its PageSize bit).
>>>> > >
>>> >
>>> > In long mode (IA-32e), bit 7 is definitely reserved.
>
> It's always reserved for PML4E (rsvd_bits_mask[x][3]), while for PDPTEs it is not reserved if you have 1GB pages.
>
>> There is a separate reserved mask for PS=1, nevermind.
>>
>
> Yeah, but the situation for IA32e rsvd_bits_mask[0][2] is exactly the same as for PAE rsvd_bits_mask[0][1], and we're not marking the bit as reserved there.
>
> The right thing to do is to add rsvd_bits(7, 7) to both rsvd_bits_mask[0][2] and rsvd_bits_mask[1][2], if 1GB pages are not supported.
>
> As written, the patch has no effect on PDPTEs because rsvd_bits_mask[0][2] is only accessed if bit 7 is zero.
>
> Nadav, would you mind preparing a follow-up? Also, how did you find these issues and test the fixes?
I will create a follow-up as soon as possible. We encountered the issues in a custom testing environment.
The fixes were validated using the failing tests, but they cover additional cases which might not have been tested.
Regards,
Nadav-