2012-11-05 09:59:36

by Xiao Guangrong

Subject: [PATCH] KVM: MMU: lazily drop large spte

Do not drop a large spte until it can be replaced by small pages, so that
the guest can happily read memory through it.

The idea is from Avi:
| As I mentioned before, write-protecting a large spte is a good idea,
| since it moves some work from protect-time to fault-time, so it reduces
| jitter. This removes the need for the return value.

Signed-off-by: Xiao Guangrong <[email protected]>
---
arch/x86/kvm/mmu.c | 34 +++++++++-------------------------
1 files changed, 9 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index b875a9e..1d8869c 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1105,7 +1105,7 @@ static void drop_large_spte(struct kvm_vcpu *vcpu, u64 *sptep)

/*
* Write-protect on the specified @sptep, @pt_protect indicates whether
- * spte writ-protection is caused by protecting shadow page table.
+ * spte write-protection is caused by protecting shadow page table.
* @flush indicates whether tlb need be flushed.
*
* Note: write protection is difference between drity logging and spte
@@ -1114,31 +1114,23 @@ static void drop_large_spte(struct kvm_vcpu *vcpu, u64 *sptep)
* its dirty bitmap is properly set.
* - for spte protection, the spte can be writable only after unsync-ing
* shadow page.
- *
- * Return true if the spte is dropped.
*/
-static bool
+static void
spte_write_protect(struct kvm *kvm, u64 *sptep, bool *flush, bool pt_protect)
{
u64 spte = *sptep;

if (!is_writable_pte(spte) &&
!(pt_protect && spte_is_locklessly_modifiable(spte)))
- return false;
+ return;

rmap_printk("rmap_write_protect: spte %p %llx\n", sptep, *sptep);

- if (__drop_large_spte(kvm, sptep)) {
- *flush |= true;
- return true;
- }
-
if (pt_protect)
spte &= ~SPTE_MMU_WRITEABLE;
spte = spte & ~PT_WRITABLE_MASK;

*flush |= mmu_spte_update(sptep, spte);
- return false;
}

static bool __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp,
@@ -1150,11 +1142,8 @@ static bool __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp,

for (sptep = rmap_get_first(*rmapp, &iter); sptep;) {
BUG_ON(!(*sptep & PT_PRESENT_MASK));
- if (spte_write_protect(kvm, sptep, &flush, pt_protect)) {
- sptep = rmap_get_first(*rmapp, &iter);
- continue;
- }

+ spte_write_protect(kvm, sptep, &flush, pt_protect);
sptep = rmap_get_next(&iter);
}

@@ -2381,14 +2370,6 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
if ((pte_access & ACC_WRITE_MASK)
|| (!vcpu->arch.mmu.direct_map && write_fault
&& !is_write_protection(vcpu) && !user_fault)) {
-
- if (level > PT_PAGE_TABLE_LEVEL &&
- has_wrprotected_page(vcpu->kvm, gfn, level)) {
- ret = 1;
- drop_spte(vcpu->kvm, sptep);
- goto done;
- }
-
spte |= PT_WRITABLE_MASK | SPTE_MMU_WRITEABLE;

if (!vcpu->arch.mmu.direct_map
@@ -2413,7 +2394,9 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
if (!can_unsync && is_writable_pte(*sptep))
goto set_pte;

- if (mmu_need_write_protect(vcpu, gfn, can_unsync)) {
+ if ((level > PT_PAGE_TABLE_LEVEL &&
+ has_wrprotected_page(vcpu->kvm, gfn, level)) ||
+ mmu_need_write_protect(vcpu, gfn, can_unsync)) {
pgprintk("%s: found shadow page for %llx, marking ro\n",
__func__, gfn);
ret = 1;
@@ -2428,7 +2411,6 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
set_pte:
if (mmu_spte_update(sptep, spte))
kvm_flush_remote_tlbs(vcpu->kvm);
-done:
return ret;
}

@@ -2635,6 +2617,8 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
break;
}

+ drop_large_spte(vcpu, iterator.sptep);
+
if (!is_shadow_present_pte(*iterator.sptep)) {
u64 base_addr = iterator.addr;

--
1.7.7.6


2012-11-12 23:10:44

by Marcelo Tosatti

Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

On Mon, Nov 05, 2012 at 05:59:26PM +0800, Xiao Guangrong wrote:
> Do not drop large spte until it can be insteaded by small pages so that
> the guest can happliy read memory through it
>
> The idea is from Avi:
> | As I mentioned before, write-protecting a large spte is a good idea,
> | since it moves some work from protect-time to fault-time, so it reduces
> | jitter. This removes the need for the return value.
>
> Signed-off-by: Xiao Guangrong <[email protected]>
> ---
> arch/x86/kvm/mmu.c | 34 +++++++++-------------------------
> 1 files changed, 9 insertions(+), 25 deletions(-)

It's likely that other 4k pages are mapped read-write in the 2MB range
covered by a read-only 2MB map. Therefore it's not entirely useful to
map it read-only.

Can you measure an improvement with this change?

2012-11-13 08:26:26

by Xiao Guangrong

Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

Hi Marcelo,

On 11/13/2012 07:10 AM, Marcelo Tosatti wrote:
> On Mon, Nov 05, 2012 at 05:59:26PM +0800, Xiao Guangrong wrote:
>> [...]
>
> Its likely that other 4k pages are mapped read-write in the 2mb range
> covered by a read-only 2mb map. Therefore its not entirely useful to
> map read-only.
>

Even a read access needs a page fault to install a pte. After this
change, that page fault can be avoided.

> Can you measure an improvement with this change?

I have a test case to measure the read time, which is attached.
It maps 4k pages at first (dirty-logged), then switches to large sptes
(dirty logging stopped), and finally measures the read access time after
write-protecting the sptes.

Before: 23314111 ns After: 11404197 ns


Attachments:
testcase.tar.bz2 (8.25 kB)

2012-11-13 15:33:56

by Takuya Yoshikawa

Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

Ccing live migration developers who should be interested in this work,

On Mon, 12 Nov 2012 21:10:32 -0200
Marcelo Tosatti <[email protected]> wrote:

> On Mon, Nov 05, 2012 at 05:59:26PM +0800, Xiao Guangrong wrote:
> > [...]
>
> Its likely that other 4k pages are mapped read-write in the 2mb range
> covered by a read-only 2mb map. Therefore its not entirely useful to
> map read-only.
>
> Can you measure an improvement with this change?

What we discussed at KVM Forum last week was about the jitter we could
measure right after starting live migration: both Isaku and Chegu reported
such jitter.

So if this patch reduces such jitter for some real workloads, by lazily
dropping largepage mappings and saving read faults until that point, that
would be very nice!

But sadly, what they measured included interactions with the outside of the
guest, and they guessed the main cause was the big QEMU lock problem.
The orders of magnitude are so different that an improvement from a
kernel-side effort may not be easy to see.

FWIW: I am now changing the initial write protection done by
kvm_mmu_slot_remove_write_access() to be rmap-based, as I proposed at KVM
Forum. ftrace said the change improved 1ms to 250-350us for a 10GB guest.
My code still drops largepage mappings, so the initial write protection time
itself may not be such a big issue here, I think.

Again, if we can eliminate read faults to such an extent that guests can see
measurable improvement, that should be very nice!

Any thoughts?

Thanks,
Takuya

2012-11-14 14:44:31

by Marcelo Tosatti

Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

On Tue, Nov 13, 2012 at 04:26:16PM +0800, Xiao Guangrong wrote:
> Hi Marcelo,
>
> On 11/13/2012 07:10 AM, Marcelo Tosatti wrote:
> > On Mon, Nov 05, 2012 at 05:59:26PM +0800, Xiao Guangrong wrote:
> >> [...]
> >
> > Its likely that other 4k pages are mapped read-write in the 2mb range
> > covered by a read-only 2mb map. Therefore its not entirely useful to
> > map read-only.
> >
>
> It needs a page fault to install a pte even if it is the read access.
> After the change, the page fault can be avoided.
>
> > Can you measure an improvement with this change?
>
> I have a test case to measure the read time which has been attached.
> It maps 4k pages at first (dirt-loggged), then switch to large sptes
> (stop dirt-logging), at the last, measure the read access time after write
> protect sptes.
>
> Before: 23314111 ns After: 11404197 ns

Ok, i'm concerned about cases similar to e49146dce8c3dc6f44 (with shadow),
that is:

- large page must be destroyed when write protecting due to
shadowed page.
- with shadow, it does not make sense to write protect
large sptes as mentioned earlier.

So i wonder why is this part from your patch

- if (level > PT_PAGE_TABLE_LEVEL &&
- has_wrprotected_page(vcpu->kvm, gfn, level)) {
- ret = 1;
- drop_spte(vcpu->kvm, sptep);
- goto done;
- }

necessary (assuming EPT is in use).

2012-11-14 14:44:56

by Marcelo Tosatti

Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

On Wed, Nov 14, 2012 at 12:33:50AM +0900, Takuya Yoshikawa wrote:
> Ccing live migration developers who should be interested in this work,
>
> On Mon, 12 Nov 2012 21:10:32 -0200
> Marcelo Tosatti <[email protected]> wrote:
>
> > On Mon, Nov 05, 2012 at 05:59:26PM +0800, Xiao Guangrong wrote:
> > > [...]
> >
> > Its likely that other 4k pages are mapped read-write in the 2mb range
> > covered by a read-only 2mb map. Therefore its not entirely useful to
> > map read-only.
> >
> > Can you measure an improvement with this change?
>
> What we discussed at KVM Forum last week was about the jitter we could
> measure right after starting live migration: both Isaku and Chegu reported
> such jitter.
>
> So if this patch reduces such jitter for some real workloads, by lazily
> dropping largepage mappings and saving read faults until that point, that
> would be very nice!
>
> But sadly, what they measured included interactions with the outside of the
> guest, and the main cause was due to the big QEMU lock problem, they guessed.
> The order is so different that an improvement by a kernel side effort may not
> be seen easily.
>
> FWIW: I am now changing the initial write protection by
> kvm_mmu_slot_remove_write_access() to rmap based as I proposed at KVM Forum.
> ftrace said that 1ms was improved to 250-350us by the change for 10GB guest.
> My code still drops largepage mappings, so the initial write protection time
> itself may not be a such big issue here, I think.
>
> Again, if we can eliminate read faults to such an extent that guests can see
> measurable improvement, that should be very nice!
>
> Any thoughts?
>
> Thanks,
> Takuya

OK, makes sense. I'm worried about shadow / oos interactions
with large read-only mappings (trying to remember what was the
case exactly, it might be non-existant now).

2012-11-14 23:17:27

by Xiao Guangrong

Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

On 11/14/2012 10:37 PM, Marcelo Tosatti wrote:
> On Tue, Nov 13, 2012 at 04:26:16PM +0800, Xiao Guangrong wrote:
>> Hi Marcelo,
>>
>> On 11/13/2012 07:10 AM, Marcelo Tosatti wrote:
>>> On Mon, Nov 05, 2012 at 05:59:26PM +0800, Xiao Guangrong wrote:
>>>> [...]
>>>
>>> Its likely that other 4k pages are mapped read-write in the 2mb range
>>> covered by a read-only 2mb map. Therefore its not entirely useful to
>>> map read-only.
>>>
>>
>> It needs a page fault to install a pte even if it is the read access.
>> After the change, the page fault can be avoided.
>>
>>> Can you measure an improvement with this change?
>>
>> I have a test case to measure the read time which has been attached.
>> It maps 4k pages at first (dirt-loggged), then switch to large sptes
>> (stop dirt-logging), at the last, measure the read access time after write
>> protect sptes.
>>
>> Before: 23314111 ns After: 11404197 ns
>
> Ok, i'm concerned about cases similar to e49146dce8c3dc6f44 (with shadow),
> that is:
>
> - large page must be destroyed when write protecting due to
> shadowed page.
> - with shadow, it does not make sense to write protect
> large sptes as mentioned earlier.
>

This case is removed now. The code when e49146dce8c3dc6f44 was applied was:
|
| pt = sp->spt;
| for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
| /* avoid RMW */
| if (is_writable_pte(pt[i]))
| update_spte(&pt[i], pt[i] & ~PT_WRITABLE_MASK);
| }

The real problem in this code is that it would write-protect a spte even if
it was not a last spte, which caused middle-level shadow page tables to be
write-protected. So e49146dce8c3dc6f44 added this code:
| if (sp->role.level != PT_PAGE_TABLE_LEVEL)
| continue;
|
which fixed that problem.

Now, the current code is:
| for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
| if (!is_shadow_present_pte(pt[i]) ||
| !is_last_spte(pt[i], sp->role.level))
| continue;
|
| spte_write_protect(kvm, &pt[i], &flush, false);
| }
It only write-protects last sptes, so it allows large sptes to exist.
(A large spte can still be broken by drop_large_spte() on the page-fault path.)

> So i wonder why is this part from your patch
>
> - if (level > PT_PAGE_TABLE_LEVEL &&
> - has_wrprotected_page(vcpu->kvm, gfn, level)) {
> - ret = 1;
> - drop_spte(vcpu->kvm, sptep);
> - goto done;
> - }
>
> necessary (assuming EPT is in use).

This is safe; we changed that code to:

- if (mmu_need_write_protect(vcpu, gfn, can_unsync)) {
+ if ((level > PT_PAGE_TABLE_LEVEL &&
+ has_wrprotected_page(vcpu->kvm, gfn, level)) ||
+ mmu_need_write_protect(vcpu, gfn, can_unsync)) {
pgprintk("%s: found shadow page for %llx, marking ro\n",
__func__, gfn);
ret = 1;

The spte becomes read-only, which ensures the shadowed gfn cannot be changed.

Btw, the original code already allowed creating a read-only spte in this case
if !(pte_access & ACC_WRITE_MASK).

2012-11-14 23:34:07

by Xiao Guangrong

Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

On 11/14/2012 10:44 PM, Marcelo Tosatti wrote:
> On Wed, Nov 14, 2012 at 12:33:50AM +0900, Takuya Yoshikawa wrote:
>> [...]
>
> OK, makes sense. I'm worried about shadow / oos interactions
> with large read-only mappings (trying to remember what was the
> case exactly, it might be non-existant now).

Marcelo, I guess commit 38187c830cab84daecb41169948467f1f19317e3 is what you
mentioned, but I do not see how it "Simplifies out of sync shadow". :(

2012-11-16 03:02:32

by Marcelo Tosatti

Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

On Thu, Nov 15, 2012 at 07:17:15AM +0800, Xiao Guangrong wrote:
> On 11/14/2012 10:37 PM, Marcelo Tosatti wrote:
> > On Tue, Nov 13, 2012 at 04:26:16PM +0800, Xiao Guangrong wrote:
> >> [...]
> >
> > Ok, i'm concerned about cases similar to e49146dce8c3dc6f44 (with shadow),
> > that is:
> >
> > - large page must be destroyed when write protecting due to
> > shadowed page.
> > - with shadow, it does not make sense to write protect
> > large sptes as mentioned earlier.
> >
>
> This case is removed now, the code when e49146dce8c3dc6f44 was applied is:
> |
> | pt = sp->spt;
> | for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
> | /* avoid RMW */
> | if (is_writable_pte(pt[i]))
> | update_spte(&pt[i], pt[i] & ~PT_WRITABLE_MASK);
> | }
>
> The real problem in this code is it would write-protect the spte even if
> it is not a last spte that caused the middle-level shadow page table was
> write-protected. So e49146dce8c3dc6f44 added this code:
> | if (sp->role.level != PT_PAGE_TABLE_LEVEL)
> | continue;
> |
> was good to fix this problem.
>
> Now, the current code is:
> | for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
> | if (!is_shadow_present_pte(pt[i]) ||
> | !is_last_spte(pt[i], sp->role.level))
> | continue;
> |
> | spte_write_protect(kvm, &pt[i], &flush, false);
> | }
> It only write-protect the last spte. So, it allows large spte existent.
> (the large spte can be broken by drop_large_spte() on the page-fault path.)
>
> > So i wonder why is this part from your patch
> >
> > - if (level > PT_PAGE_TABLE_LEVEL &&
> > - has_wrprotected_page(vcpu->kvm, gfn, level)) {
> > - ret = 1;
> > - drop_spte(vcpu->kvm, sptep);
> > - goto done;
> > - }
> >
> > necessary (assuming EPT is in use).
>
> This is safe, we change these code to:
>
> - if (mmu_need_write_protect(vcpu, gfn, can_unsync)) {
> + if ((level > PT_PAGE_TABLE_LEVEL &&
> + has_wrprotected_page(vcpu->kvm, gfn, level)) ||
> + mmu_need_write_protect(vcpu, gfn, can_unsync)) {
> pgprintk("%s: found shadow page for %llx, marking ro\n",
> __func__, gfn);
> ret = 1;
>
> The spte become read-only which can ensure the shadow gfn can not be changed.
>
> Btw, the origin code allows to create readonly spte under this case if !(pte_access & WRITEABBLE)

Regarding shadow: it should be fine as long as the fault path always deletes
large mappings when shadowed pages are present in the region.

Ah, unshadowing from reexecute_instruction does not handle
large pages. I suppose that is what the "simplification" refers to.

2012-11-16 03:39:22

by Xiao Guangrong

Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

On 11/16/2012 11:02 AM, Marcelo Tosatti wrote:
> On Thu, Nov 15, 2012 at 07:17:15AM +0800, Xiao Guangrong wrote:
>> [...]
>
> Regarding shadow: it should be fine as long as fault path always deletes
> large mappings, when shadowed pages are present in the region.

It is also safe for hard mmu; in this patch I added this code:

@@ -2635,6 +2617,8 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
 break;
 }

+ drop_large_spte(vcpu, iterator.sptep);
+

It can delete large mappings like soft mmu does.

Is there anything I missed?

>
> Ah, unshadowing from reexecute_instruction does not handle
> large pages. I suppose that is what "simplification" refers
> to.

reexecute_instruction does not directly handle the last spte; it just
removes all shadow pages and lets the CPU retry the instruction. The
page can become writable when a #PF is encountered again, so a large spte
is fine in this case.

(Out of this thread: I notice reexecute_instruction allows retrying the
instruction only if tdp_enabled == 0, but nested NPT also has pages
write-protected by shadow pages. Maybe we need to improve this
restriction.)

2012-11-16 03:57:02

by Marcelo Tosatti

Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

On Fri, Nov 16, 2012 at 11:39:12AM +0800, Xiao Guangrong wrote:
> On 11/16/2012 11:02 AM, Marcelo Tosatti wrote:
> > [...]
> >
> > Regarding shadow: it should be fine as long as fault path always deletes
> > large mappings, when shadowed pages are present in the region.
>
> For hard mmu is also safe, in this patch i added these code:
>
> @@ -2635,6 +2617,8 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
> break;
> }
>
> + drop_large_spte(vcpu, iterator.sptep);
> +
>
> It can delete large mappings like soft mmu does.
>
> Anything i missed?
>
> >
> > Ah, unshadowing from reexecute_instruction does not handle
> > large pages. I suppose that is what "simplification" refers
> > to.
>
> reexecute_instruction does not directly handle the last spte; it just
> removes all shadow pages, then lets the cpu retry the instruction. The
> page can become writable when it encounters #PF again, so a large spte
> is fine in this case.

While searching for a given "gpa", you don't find the large gfn which is
mapping it, right? (that is, searching for gfn 4 fails to find the large
read-only "gfn 0"). Unshadowing gfn 4 will keep the large read-only
mapping present.

1. large read-write spte to gfn 0
2. shadow gfn 4
3. write-protect large spte pointing to gfn 0
4. write to gfn 4
5. instruction emulation fails
5. unshadow gfn 4
6. refault, do not drop large spte because no pages shadowed
7. goto 4

> (Out of this thread: I notice reexecute_instruction allows retrying the
> instruction only if tdp_enabled == 0, but on nested npt, it also has
> pages write-protected by shadow pages. Maybe we need to improve this
> restriction.
> )

2012-11-16 04:46:26

by Xiao Guangrong

[permalink] [raw]
Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

On 11/16/2012 11:56 AM, Marcelo Tosatti wrote:
>
> While searching for a given "gpa", you don't find large gfn which is
> mapping it, right? (that is, searching for gfn 4 fails to find large
> read-only "gfn 0"). Unshadowing gfn 4 will keep large read-only mapping
> present.
>
> 1. large read-write spte to gfn 0
> 2. shadow gfn 4
> 3. write-protect large spte pointing to gfn 0
> 4. write to gfn 4
> 5. instruction emulation fails
> 5. unshadow gfn 4
> 6. refault, do not drop large spte because no pages shadowed

Hmm, it is not true. :)

The large spte can become writable since 'no pages shadowed' (that means
has_wrprotected_page() can return 0 for this case). No?


2012-11-16 09:58:03

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

On Fri, Nov 16, 2012 at 12:46:16PM +0800, Xiao Guangrong wrote:
> On 11/16/2012 11:56 AM, Marcelo Tosatti wrote:
> > On Fri, Nov 16, 2012 at 11:39:12AM +0800, Xiao Guangrong wrote:
> >> On 11/16/2012 11:02 AM, Marcelo Tosatti wrote:
> >>> On Thu, Nov 15, 2012 at 07:17:15AM +0800, Xiao Guangrong wrote:
> >>>> On 11/14/2012 10:37 PM, Marcelo Tosatti wrote:
> >>>>> On Tue, Nov 13, 2012 at 04:26:16PM +0800, Xiao Guangrong wrote:
> >>>>>> Hi Marcelo,
> >>>>>>
> >>>>>> On 11/13/2012 07:10 AM, Marcelo Tosatti wrote:
> >>>>>>> On Mon, Nov 05, 2012 at 05:59:26PM +0800, Xiao Guangrong wrote:
> >>>>>>>> Do not drop large spte until it can be insteaded by small pages so that
> >>>>>>>> the guest can happliy read memory through it
> >>>>>>>>
> >>>>>>>> The idea is from Avi:
> >>>>>>>> | As I mentioned before, write-protecting a large spte is a good idea,
> >>>>>>>> | since it moves some work from protect-time to fault-time, so it reduces
> >>>>>>>> | jitter. This removes the need for the return value.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Xiao Guangrong <[email protected]>
> >>>>>>>> ---
> >>>>>>>> arch/x86/kvm/mmu.c | 34 +++++++++-------------------------
> >>>>>>>> 1 files changed, 9 insertions(+), 25 deletions(-)
> >>>>>>>
> >>>>>>> Its likely that other 4k pages are mapped read-write in the 2mb range
> >>>>>>> covered by a read-only 2mb map. Therefore its not entirely useful to
> >>>>>>> map read-only.
> >>>>>>>
> >>>>>>
> >>>>>> It needs a page fault to install a pte even if it is the read access.
> >>>>>> After the change, the page fault can be avoided.
> >>>>>>
> >>>>>>> Can you measure an improvement with this change?
> >>>>>>
> >>>>>> I have a test case to measure the read time which has been attached.
> >>>>>> It maps 4k pages at first (dirt-loggged), then switch to large sptes
> >>>>>> (stop dirt-logging), at the last, measure the read access time after write
> >>>>>> protect sptes.
> >>>>>>
> >>>>>> Before: 23314111 ns After: 11404197 ns
> >>>>>
> >>>>> Ok, i'm concerned about cases similar to e49146dce8c3dc6f44 (with shadow),
> >>>>> that is:
> >>>>>
> >>>>> - large page must be destroyed when write protecting due to
> >>>>> shadowed page.
> >>>>> - with shadow, it does not make sense to write protect
> >>>>> large sptes as mentioned earlier.
> >>>>>
> >>>>
> >>>> This case is removed now, the code when e49146dce8c3dc6f44 was applied is:
> >>>> |
> >>>> | pt = sp->spt;
> >>>> | for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
> >>>> | /* avoid RMW */
> >>>> | if (is_writable_pte(pt[i]))
> >>>> | update_spte(&pt[i], pt[i] & ~PT_WRITABLE_MASK);
> >>>> | }
> >>>>
> >>>> The real problem in this code is it would write-protect the spte even if
> >>>> it is not a last spte that caused the middle-level shadow page table was
> >>>> write-protected. So e49146dce8c3dc6f44 added this code:
> >>>> | if (sp->role.level != PT_PAGE_TABLE_LEVEL)
> >>>> | continue;
> >>>> |
> >>>> was good to fix this problem.
> >>>>
> >>>> Now, the current code is:
> >>>> | for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
> >>>> | if (!is_shadow_present_pte(pt[i]) ||
> >>>> | !is_last_spte(pt[i], sp->role.level))
> >>>> | continue;
> >>>> |
> >>>> | spte_write_protect(kvm, &pt[i], &flush, false);
> >>>> | }
> >>>> It only write-protect the last spte. So, it allows large spte existent.
> >>>> (the large spte can be broken by drop_large_spte() on the page-fault path.)
> >>>>
> >>>>> So i wonder why is this part from your patch
> >>>>>
> >>>>> - if (level > PT_PAGE_TABLE_LEVEL &&
> >>>>> - has_wrprotected_page(vcpu->kvm, gfn, level)) {
> >>>>> - ret = 1;
> >>>>> - drop_spte(vcpu->kvm, sptep);
> >>>>> - goto done;
> >>>>> - }
> >>>>>
> >>>>> necessary (assuming EPT is in use).
> >>>>
> >>>> This is safe, we change these code to:
> >>>>
> >>>> - if (mmu_need_write_protect(vcpu, gfn, can_unsync)) {
> >>>> + if ((level > PT_PAGE_TABLE_LEVEL &&
> >>>> + has_wrprotected_page(vcpu->kvm, gfn, level)) ||
> >>>> + mmu_need_write_protect(vcpu, gfn, can_unsync)) {
> >>>> pgprintk("%s: found shadow page for %llx, marking ro\n",
> >>>> __func__, gfn);
> >>>> ret = 1;
> >>>>
> >>>> The spte become read-only which can ensure the shadow gfn can not be changed.
> >>>>
> >>>> Btw, the origin code allows to create readonly spte under this case if !(pte_access & WRITEABBLE)
> >>>
> >>> Regarding shadow: it should be fine as long as fault path always deletes
> >>> large mappings, when shadowed pages are present in the region.
> >>
> >> For hard mmu is also safe, in this patch i added these code:
> >>
> >> @@ -2635,6 +2617,8 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
> >> break;
> >> }
> >>
> >> + drop_large_spte(vcpu, iterator.sptep);
> >> +
> >>
> >> It can delete large mappings like soft mmu does.
> >>
> >> Anything i missed?
> >>
> >>>
> >>> Ah, unshadowing from reexecute_instruction does not handle
> >>> large pages. I suppose that is what "simplification" refers
> >>> to.
> >>
> >> reexecute_instruction did not directly handle last spte, it just
> >> removes all shadow pages, then let cpu retry the instruction, the
> >> page can become writable when encounter #PF again, large spte is fine
> >> under this case.
> >
> > While searching for a given "gpa", you don't find large gfn which is
> > mapping it, right? (that is, searching for gfn 4 fails to find large
> > read-only "gfn 0"). Unshadowing gfn 4 will keep large read-only mapping
> > present.
> >
> > 1. large read-write spte to gfn 0
> > 2. shadow gfn 4
> > 3. write-protect large spte pointing to gfn 0
> > 4. write to gfn 4
> > 5. instruction emulation fails
> > 5. unshadow gfn 4
> > 6. refault, do not drop large spte because no pages shadowed
7. refault, then goto 2 (as part of write to gfn 4)
>
> Hmm, it is not true. :)
>
> The large spte can become writable since 'no pages shadowed' (that means
> has_wrprotected_page() can return 0 for this case). No?

What if gfn 4 is a pagetable part of the pagedirectory chain used to
map gfn 4? See corrected step 7 above.

2012-11-17 14:06:30

by Xiao Guangrong

[permalink] [raw]
Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

On 11/16/2012 05:57 PM, Marcelo Tosatti wrote:
> On Fri, Nov 16, 2012 at 12:46:16PM +0800, Xiao Guangrong wrote:
>> On 11/16/2012 11:56 AM, Marcelo Tosatti wrote:
>>> While searching for a given "gpa", you don't find large gfn which is
>>> mapping it, right? (that is, searching for gfn 4 fails to find large
>>> read-only "gfn 0"). Unshadowing gfn 4 will keep large read-only mapping
>>> present.
>>>
>>> 1. large read-write spte to gfn 0
>>> 2. shadow gfn 4
>>> 3. write-protect large spte pointing to gfn 0
>>> 4. write to gfn 4
>>> 5. instruction emulation fails
>>> 5. unshadow gfn 4
>>> 6. refault, do not drop large spte because no pages shadowed
> 7. refault, then goto 2 (as part of write to gfn 4)
>>
>> Hmm, it is not true. :)
>>
>> The large spte can become writable since 'no pages shadowed' (that means
>> has_wrprotected_page() can return 0 for this case). No?
>
> What if gfn 4 is a pagetable part of the pagedirectory chain used to
> map gfn 4? See corrected step 7 above.

Ah, this is a real bug, and unfortunately, it exists in current
code. I will make a separate patchset to fix it. Thank you, Marcelo!

2012-11-18 05:05:47

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

On Sat, Nov 17, 2012 at 10:06:18PM +0800, Xiao Guangrong wrote:
> On 11/16/2012 05:57 PM, Marcelo Tosatti wrote:
> > On Fri, Nov 16, 2012 at 12:46:16PM +0800, Xiao Guangrong wrote:
> >> On 11/16/2012 11:56 AM, Marcelo Tosatti wrote:
> >>> While searching for a given "gpa", you don't find large gfn which is
> >>> mapping it, right? (that is, searching for gfn 4 fails to find large
> >>> read-only "gfn 0"). Unshadowing gfn 4 will keep large read-only mapping
> >>> present.
> >>>
> >>> 1. large read-write spte to gfn 0
> >>> 2. shadow gfn 4
> >>> 3. write-protect large spte pointing to gfn 0
> >>> 4. write to gfn 4
> >>> 5. instruction emulation fails
> >>> 5. unshadow gfn 4
> >>> 6. refault, do not drop large spte because no pages shadowed
> > 7. refault, then goto 2 (as part of write to gfn 4)
> >>
> >> Hmm, it is not true. :)
> >>
> >> The large spte can become writable since 'no pages shadowed' (that means
> >> has_wrprotected_page() can return 0 for this case). No?
> >
> > What if gfn 4 is a pagetable part of the pagedirectory chain used to
> > map gfn 4? See corrected step 7 above.
>
> Ah, this is a real bug, and unfortunately, it exists in current
> code. I will make a separate patchset to fix it. Thank you, Marcelo!

Is it? Hum..

Anyway, it would be great if you can write a testcase (should be similar
in size to rmap_chain).

2012-11-28 05:27:50

by Xiao Guangrong

[permalink] [raw]
Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

On 11/18/2012 11:00 AM, Marcelo Tosatti wrote:
>>> What if gfn 4 is a pagetable part of the pagedirectory chain used to
>>> map gfn 4? See corrected step 7 above.
>>
>> Ah, this is a real bug, and unfortunately, it exists in current
>> code. I will make a separate patchset to fix it. Thank you, Marcelo!
>
> Is it? Hum..
>
> Anyway, it would be great if you can write a testcase (should be similar
> in size to rmap_chain).

Marcelo, is this patch acceptable?

2012-11-28 11:51:17

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: [PATCH] KVM: MMU: lazily drop large spte

On Wed, Nov 28, 2012 at 01:27:38PM +0800, Xiao Guangrong wrote:
> On 11/18/2012 11:00 AM, Marcelo Tosatti wrote:
> >>> What if gfn 4 is a pagetable part of the pagedirectory chain used to
> >>> map gfn 4? See corrected step 7 above.
> >>
> >> Ah, this is a real bug, and unfortunately, it exists in current
> >> code. I will make a separate patchset to fix it. Thank you, Marcelo!
> >
> > Is it? Hum..
> >
> > Anyway, it would be great if you can write a testcase (should be similar
> > in size to rmap_chain).
>
> Marcelo, is this patch acceptable?

Yes. Can we get the reexecute_instruction fix first? (It should then be
able to handle the case where a large read-only spte is created.)

I'll merge the testcase later today.