2015-04-07 20:35:35

by Radim Krčmář

[permalink] [raw]
Subject: [PATCH] KVM: dirty all pages in kvm_write_guest_cached()

We dirtied only one page because writes originally couldn't span more.
Use improved syntax for '>> PAGE_SHIFT' while at it.

Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
Signed-off-by: Radim Krčmář <[email protected]>
---
The function handles cross memslot writes in a different path.

I think we should dirty pages after partial writes too (r < len),
but it probably won't happen and I already started refactoring :)

virt/kvm/kvm_main.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index aadef264bed1..863df9dcab6f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1665,6 +1665,7 @@ int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
{
struct kvm_memslots *slots = kvm_memslots(kvm);
int r;
+ gfn_t gfn;

BUG_ON(len > ghc->len);

@@ -1680,7 +1681,10 @@ int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
r = __copy_to_user((void __user *)ghc->hva, data, len);
if (r)
return -EFAULT;
- mark_page_dirty_in_slot(kvm, ghc->memslot, ghc->gpa >> PAGE_SHIFT);
+
+ for (gfn = gpa_to_gfn(ghc->gpa);
+ gfn <= gpa_to_gfn(ghc->gpa + len - 1); gfn++)
+ mark_page_dirty_in_slot(kvm, ghc->memslot, gfn);

return 0;
}
--
2.3.4


2015-04-08 08:50:23

by Paolo Bonzini

Subject: Re: [PATCH] KVM: dirty all pages in kvm_write_guest_cached()



On 07/04/2015 22:34, Radim Krčmář wrote:
> We dirtied only one page because writes originally couldn't span more.
> Use improved syntax for '>> PAGE_SHIFT' while at it.
>
> Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
> Signed-off-by: Radim Krčmář <[email protected]>

Cross-page reads and writes should never get here; they have
ghc->memslot set to NULL and go through the slow path in kvm_write_guest.

What am I missing?

Paolo

> ---
> The function handles cross memslot writes in a different path.
>
> I think we should dirty pages after partial writes too (r < len),
> but it probably won't happen and I already started refactoring :)
>
> virt/kvm/kvm_main.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index aadef264bed1..863df9dcab6f 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1665,6 +1665,7 @@ int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
> {
> struct kvm_memslots *slots = kvm_memslots(kvm);
> int r;
> + gfn_t gfn;
>
> BUG_ON(len > ghc->len);
>
> @@ -1680,7 +1681,10 @@ int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
> r = __copy_to_user((void __user *)ghc->hva, data, len);
> if (r)
> return -EFAULT;
> - mark_page_dirty_in_slot(kvm, ghc->memslot, ghc->gpa >> PAGE_SHIFT);
> +
> + for (gfn = gpa_to_gfn(ghc->gpa);
> + gfn <= gpa_to_gfn(ghc->gpa + len - 1); gfn++)
> + mark_page_dirty_in_slot(kvm, ghc->memslot, gfn);
>
> return 0;
> }
>

2015-04-08 09:26:20

by Radim Krčmář

Subject: Re: [PATCH] KVM: dirty all pages in kvm_write_guest_cached()

2015-04-08 10:49+0200, Paolo Bonzini:
> On 07/04/2015 22:34, Radim Krčmář wrote:
> > We dirtied only one page because writes originally couldn't span more.
> > Use improved syntax for '>> PAGE_SHIFT' while at it.
> >
> > Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
> > Signed-off-by: Radim Krčmář <[email protected]>
>
> Cross-page reads and writes should never get here; they have
> ghc->memslot set to NULL and go through the slow path in kvm_write_guest.

Only cross-memslot writes have NULL memslot.

> What am I missing?

kvm_gfn_to_hva_cache_init() queries how many pages remain in the
memslot and compares that with the number of pages needed.
If the write fits in the memslot, it is done without
kvm_write_guest, regardless of how many pages it touches.

The relevant code path in kvm_gfn_to_hva_cache_init():
	gfn_t nr_pages_needed = end_gfn - start_gfn + 1;
	ghc->memslot = gfn_to_memslot(kvm, start_gfn);
	ghc->hva = gfn_to_hva_many(ghc->memslot, start_gfn, &nr_pages_avail);
	if (!kvm_is_error_hva(ghc->hva) && nr_pages_avail >= nr_pages_needed)
		ghc->hva += offset;
	return 0;

2015-04-08 10:43:21

by Paolo Bonzini

Subject: Re: [PATCH] KVM: dirty all pages in kvm_write_guest_cached()



On 08/04/2015 11:26, Radim Krčmář wrote:
> 2015-04-08 10:49+0200, Paolo Bonzini:
>> On 07/04/2015 22:34, Radim Krčmář wrote:
>>> We dirtied only one page because writes originally couldn't span more.
>>> Use improved syntax for '>> PAGE_SHIFT' while at it.
>>>
>>> Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
>>> Signed-off-by: Radim Krčmář <[email protected]>
>>
>> Cross-page reads and writes should never get here; they have
>> ghc->memslot set to NULL and go through the slow path in kvm_write_guest.
>
> Only cross-memslot writes have NULL memslot.

The power of wrong comments...

Considering how kvm_gfn_to_hva_cache_init is used (one 1-byte field, two
4-byte fields, one 28-bytes struct that is 32-bytes aligned, one
32-bytes field that is in practice cacheline-aligned), I wonder if we
should just use ghc->memslot = NULL for cross page writes. This would
bypass the bug you are fixing here, and avoid worries about partial writes.

Paolo

2015-04-08 12:16:55

by Radim Krčmář

Subject: [PATCH] KVM: use slowpath for cross page cached accesses

2015-04-08 12:43+0200, Paolo Bonzini:
> On 08/04/2015 11:26, Radim Krčmář wrote:
>> 2015-04-08 10:49+0200, Paolo Bonzini:
>>> On 07/04/2015 22:34, Radim Krčmář wrote:
>>>> We dirtied only one page because writes originally couldn't span more.
>>>> Use improved syntax for '>> PAGE_SHIFT' while at it.
>>>>
>>>> Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
>>>> Signed-off-by: Radim Krčmář <[email protected]>
>>>
>>> Cross-page reads and writes should never get here; they have
>>> ghc->memslot set to NULL and go through the slow path in kvm_write_guest.
>>
>> Only cross-memslot writes have NULL memslot.
>
> The power of wrong comments...
>
> Considering how kvm_gfn_to_hva_cache_init is used (one 1-byte field, two
> 4-byte fields, one 28-bytes struct that is 32-bytes aligned, one
> 32-bytes field that is in practice cacheline-aligned), I wonder if we
> should just use ghc->memslot = NULL for cross page writes. This would
> bypass the bug you are fixing here, and avoid worries about partial writes.

Good idea, and it could make those comments right :)
(Though in general, I prefer less constraints on APIs ...)

Partial writes would be a pain; copy_to_user API does not define which
bytes were not written. I think the write can't fail mid-page, which
makes our implementation ok, but I still worry a bit about it.

Anyway, here's the patch:

---8<---
kvm_write_guest_cached() does not mark all written pages as dirty and
code comments in kvm_gfn_to_hva_cache_init() talk about NULL memslot
with cross page accesses. Fix all the easy way.

The check is '<= 1' to have the same result for 'len = 0' cache anywhere
in the page. (nr_pages_needed is 0 on page boundary.)

Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
Signed-off-by: Radim Krčmář <[email protected]>
---
virt/kvm/kvm_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index aadef264bed1..f3dc641f9640 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1637,8 +1637,8 @@ int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
ghc->generation = slots->generation;
ghc->len = len;
ghc->memslot = gfn_to_memslot(kvm, start_gfn);
- ghc->hva = gfn_to_hva_many(ghc->memslot, start_gfn, &nr_pages_avail);
- if (!kvm_is_error_hva(ghc->hva) && nr_pages_avail >= nr_pages_needed) {
+ ghc->hva = gfn_to_hva_many(ghc->memslot, start_gfn, NULL);
+ if (!kvm_is_error_hva(ghc->hva) && nr_pages_needed <= 1) {
ghc->hva += offset;
} else {
/*

2015-04-08 12:23:14

by Paolo Bonzini

Subject: Re: [PATCH] KVM: use slowpath for cross page cached accesses



On 08/04/2015 14:16, Radim Krčmář wrote:
> 2015-04-08 12:43+0200, Paolo Bonzini:
>> On 08/04/2015 11:26, Radim Krčmář wrote:
>>> Only cross-memslot writes have NULL memslot.
>>
>> The power of wrong comments...
>>
>> Considering how kvm_gfn_to_hva_cache_init is used (one 1-byte field, two
>> 4-byte fields, one 28-bytes struct that is 32-bytes aligned, one
>> 32-bytes field that is in practice cacheline-aligned), I wonder if we
>> should just use ghc->memslot = NULL for cross page writes. This would
>> bypass the bug you are fixing here, and avoid worries about partial writes.
>
> Good idea, and it could make those comments right :)
> (Though in general, I prefer less constraints on APIs ...)

It doesn't add constraints; cross page writes are still handled right
(just slower). copy_to_user is, in some sense, the API that constrains
us to do this.

> Partial writes would be a pain; copy_to_user API does not define which
> bytes were not written. I think the write can't fail mid-page, which
> makes our implementation ok

No, writes can't fail mid-page (I guess in atomic context it's
theoretically possible, but we're equipped to handle the failure in that
case).

Patch applied, thanks!

Paolo

2015-04-09 00:36:24

by Wanpeng Li

Subject: Re: [PATCH] KVM: use slowpath for cross page cached accesses

On Wed, Apr 08, 2015 at 02:16:48PM +0200, Radim Krčmář wrote:
>2015-04-08 12:43+0200, Paolo Bonzini:
>> On 08/04/2015 11:26, Radim Krčmář wrote:
>>> 2015-04-08 10:49+0200, Paolo Bonzini:
>>>> On 07/04/2015 22:34, Radim Krčmář wrote:
>>>>> We dirtied only one page because writes originally couldn't span more.
>>>>> Use improved syntax for '>> PAGE_SHIFT' while at it.
>>>>>
>>>>> Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
>>>>> Signed-off-by: Radim Krčmář <[email protected]>
>>>>
>>>> Cross-page reads and writes should never get here; they have
>>>> ghc->memslot set to NULL and go through the slow path in kvm_write_guest.
>>>
>>> Only cross-memslot writes have NULL memslot.
>>
>> The power of wrong comments...
>>
>> Considering how kvm_gfn_to_hva_cache_init is used (one 1-byte field, two
>> 4-byte fields, one 28-bytes struct that is 32-bytes aligned, one
>> 32-bytes field that is in practice cacheline-aligned), I wonder if we
>> should just use ghc->memslot = NULL for cross page writes. This would
>> bypass the bug you are fixing here, and avoid worries about partial writes.
>
>Good idea, and it could make those comments right :)
>(Though in general, I prefer less constraints on APIs ...)
>
>Partial writes would be a pain; copy_to_user API does not define which
>bytes were not written. I think the write can't fail mid-page, which
>makes our implementation ok, but I still worry a bit about it.
>
>Anyway, here's the patch:
>
>---8<---
>kvm_write_guest_cached() does not mark all written pages as dirty and
>code comments in kvm_gfn_to_hva_cache_init() talk about NULL memslot
>with cross page accesses. Fix all the easy way.
>
>The check is '<= 1' to have the same result for 'len = 0' cache anywhere
>in the page. (nr_pages_needed is 0 on page boundary.)
>
>Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
>Signed-off-by: Radim Krčmář <[email protected]>

Reviewed-by: Wanpeng Li <[email protected]>

>---
> virt/kvm/kvm_main.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>index aadef264bed1..f3dc641f9640 100644
>--- a/virt/kvm/kvm_main.c
>+++ b/virt/kvm/kvm_main.c
>@@ -1637,8 +1637,8 @@ int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
> ghc->generation = slots->generation;
> ghc->len = len;
> ghc->memslot = gfn_to_memslot(kvm, start_gfn);
>- ghc->hva = gfn_to_hva_many(ghc->memslot, start_gfn, &nr_pages_avail);
>- if (!kvm_is_error_hva(ghc->hva) && nr_pages_avail >= nr_pages_needed) {
>+ ghc->hva = gfn_to_hva_many(ghc->memslot, start_gfn, NULL);
>+ if (!kvm_is_error_hva(ghc->hva) && nr_pages_needed <= 1) {
> ghc->hva += offset;
> } else {
> /*
>--