2010-06-25 12:09:17

by Xiao Guangrong

Subject: [PATCH v2 1/10] KVM: MMU: fix writable sync sp mapping

While syncing an unsync sp, we may map an spte writable. This is
dangerous if the gfn mapped by one unsync sp is the gfn of another
unsync page.

For example, take two unsync pages SP1 and SP2, where:

SP1.pte[0] = P
SP2.gfn's pfn = P
[SP1.pte[0] = SP2.gfn's pfn]

First, we unsync SP2. This write-protects SP2.gfn, and since
SP1.pte[0] maps this page, it is marked read-only.

Then, we unsync SP1, and SP1.pte[0] may be marked writable.

Now SP2.gfn can be written through the SP1.pte[0] mapping.

This bug corrupts the guest's page table. Fix it by marking the mapping
read-only if the mapped gfn has a shadow page.

Signed-off-by: Xiao Guangrong <[email protected]>
---
arch/x86/kvm/mmu.c | 14 ++++----------
1 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 045a0f9..556a798 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1810,11 +1810,14 @@ static int mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
 	bool need_unsync = false;

 	for_each_gfn_indirect_valid_sp(vcpu->kvm, s, gfn, node) {
+		if (!can_unsync)
+			return 1;
+
 		if (s->role.level != PT_PAGE_TABLE_LEVEL)
 			return 1;

 		if (!need_unsync && !s->unsync) {
-			if (!can_unsync || !oos_shadow)
+			if (!oos_shadow)
 				return 1;
 			need_unsync = true;
 		}
@@ -1877,15 +1880,6 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		if (!tdp_enabled && !(pte_access & ACC_WRITE_MASK))
 			spte &= ~PT_USER_MASK;

-		/*
-		 * Optimization: for pte sync, if spte was writable the hash
-		 * lookup is unnecessary (and expensive). Write protection
-		 * is responsibility of mmu_get_page / kvm_sync_page.
-		 * Same reasoning can be applied to dirty page accounting.
-		 */
-		if (!can_unsync && is_writable_pte(*sptep))
-			goto set_pte;
-
 		if (mmu_need_write_protect(vcpu, gfn, can_unsync)) {
 			pgprintk("%s: found shadow page for %lx, marking ro\n",
 				 __func__, gfn);
--
1.6.1.2


2010-06-27 08:03:10

by Xiao Guangrong

Subject: Re: [PATCH v2 1/10] KVM: MMU: fix writable sync sp mapping



Xiao Guangrong wrote:

>
> - /*
> - * Optimization: for pte sync, if spte was writable the hash
> - * lookup is unnecessary (and expensive). Write protection
> - * is responsibility of mmu_get_page / kvm_sync_page.
> - * Same reasoning can be applied to dirty page accounting.
> - */
> - if (!can_unsync && is_writable_pte(*sptep))
> - goto set_pte;
> -

Sorry, this optimization does not break anything; removing it was my mistake.
Please review this version instead.

Subject: [PATCH v2 1/10] KVM: MMU: fix writable sync sp mapping

While syncing an unsync sp, we may map an spte writable. This is
dangerous if the gfn mapped by one unsync sp is the gfn of another
unsync page.

For example, take two unsync pages SP1 and SP2, where:

SP1.pte[0] = P
SP2.gfn's pfn = P
[SP1.pte[0] = SP2.gfn's pfn]

First, we unsync SP2. This write-protects SP2.gfn, and since
SP1.pte[0] maps this page, it is marked read-only.

Then, we unsync SP1, and SP1.pte[0] may be marked writable.

Now SP2.gfn can be written through the SP1.pte[0] mapping.

This bug corrupts the guest's page table. Fix it by marking the mapping
read-only if the mapped gfn has a shadow page.

Signed-off-by: Xiao Guangrong <[email protected]>
---
arch/x86/kvm/mmu.c | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 045a0f9..24290f8 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1810,11 +1810,14 @@ static int mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
 	bool need_unsync = false;

 	for_each_gfn_indirect_valid_sp(vcpu->kvm, s, gfn, node) {
+		if (!can_unsync)
+			return 1;
+
 		if (s->role.level != PT_PAGE_TABLE_LEVEL)
 			return 1;

 		if (!need_unsync && !s->unsync) {
-			if (!can_unsync || !oos_shadow)
+			if (!oos_shadow)
 				return 1;
 			need_unsync = true;
 		}
--
1.6.1.2

2010-06-28 09:18:12

by Avi Kivity

Subject: Re: [PATCH v2 1/10] KVM: MMU: fix writable sync sp mapping

On 06/27/2010 10:59 AM, Xiao Guangrong wrote:
>
> Xiao Guangrong wrote:
>
>
>>
>> - /*
>> - * Optimization: for pte sync, if spte was writable the hash
>> - * lookup is unnecessary (and expensive). Write protection
>> - * is responsibility of mmu_get_page / kvm_sync_page.
>> - * Same reasoning can be applied to dirty page accounting.
>> - */
>> - if (!can_unsync && is_writable_pte(*sptep))
>> - goto set_pte;
>> -
>>
> Sorry, this optimization not broken anything, just my mistake, please review
> this.
>
> Subject: [PATCH v2 1/10] KVM: MMU: fix writable sync sp mapping
>
> While we sync the unsync sp, we may mapping the spte writable, it's
> dangerous, if one unsync sp's mapping gfn is another unsync page's gfn.
>
> For example:
> have two unsync pages SP1, SP2 and:
>
> SP1.pte[0] = P
> SP2.gfn's pfn = P
> [SP1.pte[0] = SP2.gfn's pfn]
>
> First, we unsync SP2, it will write protect for SP2.gfn since
>

Do you mean we sync SP2 here?

> SP1.pte[0] is mapping to this page, it will mark read only.
>
> Then, we unsync SP1, SP1.pte[0] may mark to writable.
>

How can unsyncing SP1 change SP1.pte[0]?

When we unsync SP2 by a fault through SP1.pte[0], that can cause
SP1.pte[0] to become writable. But unsyncing SP1 shouldn't have an
effect on its sptes.

> Now, we will write SP2.gfn by SP1.pte[0] mapping
>
> This bug will corrupt guest's page table, fixed by mark read-only mapping
> if the mapped gfn has shadow page
>
> Signed-off-by: Xiao Guangrong <[email protected]>
> ---
> arch/x86/kvm/mmu.c | 5 ++++-
> 1 files changed, 4 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 045a0f9..24290f8 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -1810,11 +1810,14 @@ static int mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
> bool need_unsync = false;
>
> for_each_gfn_indirect_valid_sp(vcpu->kvm, s, gfn, node) {
> + if (!can_unsync)
> + return 1;
> +
>

What if the page is already unsync? We don't need write protection in
this case.

> if (s->role.level != PT_PAGE_TABLE_LEVEL)
> return 1;
>
> if (!need_unsync && !s->unsync) {
> - if (!can_unsync || !oos_shadow)
> + if (!oos_shadow)
> return 1;
> need_unsync = true;
> }
>

How can this change anything? On the first pass, need_unsync = false,
so we will check can_unsync and return.

--
error compiling committee.c: too many arguments to function

2010-06-28 09:44:30

by Xiao Guangrong

Subject: Re: [PATCH v2 1/10] KVM: MMU: fix writable sync sp mapping



Avi Kivity wrote:

>>
>> for_each_gfn_indirect_valid_sp(vcpu->kvm, s, gfn, node) {
>> + if (!can_unsync)
>> + return 1;
>> +
>>
>
> What if the page is already unsync? We don't need write protection in
> this case.

Avi,

The reason is that when we sync the children sps, we write-protect all of
them first. The relevant code is:

| static void mmu_sync_children(...)
| {
| 	......
| 	for_each_sp(pages, sp, parents, i)
| 		protected |= rmap_write_protect(vcpu->kvm, sp->gfn);  <==== A
|
| 	if (protected)
| 		kvm_flush_remote_tlbs(vcpu->kvm);
|
| 	for_each_sp(pages, sp, parents, i) {
| 		kvm_sync_page(vcpu, sp, &invalid_list);  <==== B
| 		mmu_pages_clear_parents(&parents);
| 	}
| 	......
| }

For example:

SP1.pte[0] = P
SP2.gfn's pfn = P
[SP1.pte[0] = SP2.gfn's pfn]

At point A, SP1.gfn and SP2.gfn are write-protected.

At point B, suppose SP1 is synced first. While it is being synced, the check
for SP1.pte[0]'s gfn finds only one shadow page, SP2, which is still marked
unsync, so the gfn is mapped writable. Then SP2 is synced and becomes a sync
page.

The final result: SP2 is a sync page, but SP2.gfn is writable.
Note: kvm_sync_page() no longer does the write-protection since commit 95b4b26cfc.
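
The sequence above can be replayed in a small user-space model. This is only
a sketch under loud assumptions: struct toy_sp, struct toy_mmu and every
toy_* function are made up for illustration and are not kernel code. The model
covers just the sync path (can_unsync == false) for last-level pages, and the
"patched" flag stands in for the new "if (!can_unsync) return 1;" check.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NGFN 4

/* Toy stand-in for a shadow page; NOT the kernel's struct kvm_mmu_page. */
struct toy_sp {
	int gfn;	/* guest frame this page shadows */
	int pte0_gfn;	/* gfn mapped by pte[0], or -1 if none */
	bool unsync;
};

struct toy_mmu {
	struct toy_sp sp[2];
	bool gfn_writable[TOY_NGFN];
};

/* Must a mapping of gfn be read-only? Sync path only (can_unsync == false). */
static bool toy_need_write_protect(const struct toy_mmu *m, int gfn,
				   bool patched)
{
	for (int i = 0; i < 2; i++) {
		if (m->sp[i].gfn != gfn)
			continue;
		if (patched)
			return true;	/* any shadow page on gfn: read-only */
		if (!m->sp[i].unsync)
			return true;	/* old check trusts the unsync flag */
	}
	return false;
}

/* Re-create pte[0] of sp, then mark sp as synced. */
static void toy_sync_page(struct toy_mmu *m, struct toy_sp *sp, bool patched)
{
	assert(sp->pte0_gfn < TOY_NGFN);
	if (sp->pte0_gfn >= 0)
		m->gfn_writable[sp->pte0_gfn] =
			!toy_need_write_protect(m, sp->pte0_gfn, patched);
	sp->unsync = false;
}

/* Returns true if SP2 ends up synced while SP2.gfn is writable (the bug). */
static bool toy_sync_children(bool patched)
{
	struct toy_mmu m = {
		.sp = {
			{ .gfn = 1, .pte0_gfn = 2, .unsync = true },	/* SP1 */
			{ .gfn = 2, .pte0_gfn = -1, .unsync = true },	/* SP2 */
		},
		.gfn_writable = { false, true, true, false },
	};

	/* Point A: write-protect every page that is about to be synced. */
	for (int i = 0; i < 2; i++)
		m.gfn_writable[m.sp[i].gfn] = false;

	/* Point B: sync SP1 first, then SP2. */
	for (int i = 0; i < 2; i++)
		toy_sync_page(&m, &m.sp[i], patched);

	return !m.sp[1].unsync && m.gfn_writable[m.sp[1].gfn];
}
```

In this model toy_sync_children(false) returns true (SP2 is synced but its gfn
is writable again), while toy_sync_children(true) returns false.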

2010-06-28 11:41:39

by Avi Kivity

Subject: Re: [PATCH v2 1/10] KVM: MMU: fix writable sync sp mapping

On 06/28/2010 12:40 PM, Xiao Guangrong wrote:
>
> Avi Kivity wrote:
>
>
>>> for_each_gfn_indirect_valid_sp(vcpu->kvm, s, gfn, node) {
>>> + if (!can_unsync)
>>> + return 1;
>>> +
>>>
>>>
>> What if the page is already unsync? We don't need write protection in
>> this case.
>>
> Avi,
>
> The reason is when we sync children sps, we write-protected for all sps first,
> list relevant code:
>
> | static void mmu_sync_children(...)
> | {
> | ......
> | for_each_sp(pages, sp, parents, i)
> | protected |= rmap_write_protect(vcpu->kvm, sp->gfn); <==== A
> |
> | if (protected)
> | kvm_flush_remote_tlbs(vcpu->kvm);
> |
> | for_each_sp(pages, sp, parents, i) {
> | kvm_sync_page(vcpu, sp, &invalid_list); <==== B
> | mmu_pages_clear_parents(&parents);
> | }
> | ......
> |}
>
> For example:
>
> SP1.pte[0] = P
> SP2.gfn's pfn = P
> [SP1.pte[0] = SP2.gfn's pfn]
>
> At A point, SP1.gfn and SP2.gfn are write-protected.
>
> At B point, if sync SP1 first, while it's synced. it will detect SP1.pte[0].gfn only has one unsync-sp,
> that is SP2, so it will mapping it writable, then we sync SP2, we will set SP2 to sync page.
>
> The final result is: SP2 is the sync page but SP2.gfn is writable.
>


I think I see. So, after A, the pages are write protected, but are
still marked as unsync. In B, we're testing SP2->unsync, which we plan
to sync soon, but haven't yet. So the test for s->unsync is incorrect.

So the patch is right. Thanks for the explanation. Please update the
changelog to note that sp->unsync is not reliable during resync, this is
tricky stuff.

--
error compiling committee.c: too many arguments to function
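
To make the point about s->unsync concrete, here is a reduced side-by-side
model of the check before and after the patch. It is a sketch, not the kernel
function: struct toy_shadow, TOY_LAST_LEVEL, toy_wp_old() and toy_wp_new() are
hypothetical names, the hash walk is replaced by a plain array of the shadow
pages found on one gfn, oos_shadow is passed as a parameter, and the
kvm_unsync_pages() call that follows the loop in the real code is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_LAST_LEVEL 1	/* models PT_PAGE_TABLE_LEVEL */

/* Hypothetical, reduced stand-in for struct kvm_mmu_page. */
struct toy_shadow {
	int level;
	bool unsync;
};

/* The check as it was before the patch. */
static int toy_wp_old(const struct toy_shadow *s, size_t n,
		      bool can_unsync, bool oos_shadow)
{
	bool need_unsync = false;

	assert(s || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (s[i].level != TOY_LAST_LEVEL)
			return 1;
		/* can_unsync is only consulted for pages NOT flagged unsync */
		if (!need_unsync && !s[i].unsync) {
			if (!can_unsync || !oos_shadow)
				return 1;
			need_unsync = true;
		}
	}
	return 0;
}

/* The check with the patch applied. */
static int toy_wp_new(const struct toy_shadow *s, size_t n,
		      bool can_unsync, bool oos_shadow)
{
	bool need_unsync = false;

	assert(s || n == 0);
	for (size_t i = 0; i < n; i++) {
		/* during sync, any shadow page on the gfn forces read-only */
		if (!can_unsync)
			return 1;
		if (s[i].level != TOY_LAST_LEVEL)
			return 1;
		if (!need_unsync && !s[i].unsync) {
			if (!oos_shadow)
				return 1;
			need_unsync = true;
		}
	}
	return 0;
}
```

With one last-level page that is still flagged unsync and can_unsync == false,
toy_wp_old() returns 0 and toy_wp_new() returns 1. The two versions disagree
only when can_unsync is false and every shadow page on the gfn is a last-level
page flagged unsync, which is the resync window discussed above.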

2010-06-29 01:23:14

by Xiao Guangrong

Subject: Re: [PATCH v2 1/10] KVM: MMU: fix writable sync sp mapping



Avi Kivity wrote:

>
>
> I think I see. So, after A, the pages are write protected, but are
> still marked as unsync. In B, we're testing SP2->unsync, which we plan
> to sync soon, but haven't yet. So the test for s->unsync is incorrect.
>

Right.

> So the patch is right. Thanks for the explanation. Please update the
> changelog to note that sp->unsync is not reliable during resync, this is
> tricky stuff.
>

OK, will do it in the next version.