2013-06-08 03:15:47

by Xiao Guangrong

[permalink] [raw]
Subject: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

From: Xiao Guangrong <[email protected]>

Currently, memory synchronization is missed in emulator_fix_hypercall,
please see the commit 758ccc89b83
(KVM: x86: drop calling kvm_mmu_zap_all in emulator_fix_hypercall)

This patch fixes it by introducing kvm_vcpus_hang_on_page_start() and
kvm_vcpus_hang_on_page_end which unmap the patched page from guest
and use kvm_flush_remote_tlbs() as the serializing instruction to
ensure the memory coherence
[ The SDM said that INVEPT, INVVPID and MOV (to control register, with
the exception of MOV CR8) are the serializing instructions. ]

The mmu-lock is held during host patches the page so that it stops vcpus
to fix its further page fault

Signed-off-by: Xiao Guangrong <[email protected]>
---
arch/x86/kvm/mmu.c | 25 +++++++++++++++++++++++++
arch/x86/kvm/mmu.h | 3 +++
arch/x86/kvm/x86.c | 7 +++++++
3 files changed, 35 insertions(+)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 7d50a2d..35cd0b6 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4536,6 +4536,31 @@ int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4])
}
EXPORT_SYMBOL_GPL(kvm_mmu_get_spte_hierarchy);

+/*
+ * Force vcpu to hang when it is trying to access the specified page.
+ *
+ * kvm_vcpus_hang_on_page_start and kvm_vcpus_hang_on_page_end should
+ * be used in pairs and they are currently used to sync memory access
+ * between vcpus when host cross-modifies the code segment of guest.
+ *
+ * We unmap the page from the guest and do memory synchronization by
+ * kvm_flush_remote_tlbs() under the protection of mmu-lock. If vcpu
+ * accesses the page, it will trigger #PF and be blocked on mmu-lock.
+ */
+void kvm_vcpus_hang_on_page_start(struct kvm *kvm, gfn_t gfn)
+{
+ spin_lock(&kvm->mmu_lock);
+
+ /* kvm_flush_remote_tlbs() can act as serializing instruction. */
+ if (kvm_unmap_hva(kvm, gfn_to_hva(kvm, gfn)))
+ kvm_flush_remote_tlbs(kvm);
+}
+
+void kvm_vcpus_hang_on_page_end(struct kvm *kvm)
+{
+ spin_unlock(&kvm->mmu_lock);
+}
+
void kvm_mmu_destroy(struct kvm_vcpu *vcpu)
{
ASSERT(vcpu);
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 5b59c57..35910be 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -115,4 +115,7 @@ static inline bool permission_fault(struct kvm_mmu *mmu, unsigned pte_access,
}

void kvm_mmu_invalidate_zap_all_pages(struct kvm *kvm);
+
+void kvm_vcpus_hang_on_page_start(struct kvm *kvm, gfn_t gfn);
+void kvm_vcpus_hang_on_page_end(struct kvm *kvm);
#endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9e4afa7..776bf1a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5528,8 +5528,15 @@ static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
char instruction[3];
unsigned long rip = kvm_rip_read(vcpu);
+ gpa_t gpa;
+
+ gpa = kvm_mmu_gva_to_gpa_fetch(vcpu, rip, NULL);
+ if (gpa == UNMAPPED_GVA)
+ return X86EMUL_PROPAGATE_FAULT;

+ kvm_vcpus_hang_on_page_start(vcpu->kvm, gpa_to_gfn(gpa));
kvm_x86_ops->patch_hypercall(vcpu, instruction);
+ kvm_vcpus_hang_on_page_end(vcpu->kvm);

return emulator_write_emulated(ctxt, rip, instruction, 3, NULL);
}
--
1.8.1.4


2013-06-09 08:45:42

by Gleb Natapov

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On Sat, Jun 08, 2013 at 11:15:37AM +0800, Xiao Guangrong wrote:
> From: Xiao Guangrong <[email protected]>
>
> Currently, memory synchronization is missed in emulator_fix_hypercall,
> please see the commit 758ccc89b83
> (KVM: x86: drop calling kvm_mmu_zap_all in emulator_fix_hypercall)
>
> This patch fixes it by introducing kvm_vcpus_hang_on_page_start() and
> kvm_vcpus_hang_on_page_end which unmap the patched page from guest
> and use kvm_flush_remote_tlbs() as the serializing instruction to
> ensure the memory coherence
> [ The SDM said that INVEPT, INVVPID and MOV (to control register, with
> the exception of MOV CR8) are the serializing instructions. ]
>
> The mmu-lock is held during host patches the page so that it stops vcpus
> to fix its further page fault
>
I have a patch to implement is much simple and in generic way, not
relying on MMU internals.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6402951..49774ab 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5517,7 +5517,7 @@ out:
}
EXPORT_SYMBOL_GPL(kvm_emulate_hypercall);

-static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
+static int emulator_fix_hypercall_cb(void *ctxt)
{
struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
char instruction[3];
@@ -5528,6 +5528,14 @@ static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
return emulator_write_emulated(ctxt, rip, instruction, 3, NULL);
}

+static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
+{
+ struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+ return kvm_exec_with_stopped_vcpu(vcpu->kvm,
+ emulator_fix_hypercall_cb, ctxt);
+}
+
+
/*
* Check if userspace requested an interrupt window, and that the
* interrupt window is open.
@@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
kvm_deliver_pmi(vcpu);
if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
vcpu_scan_ioapic(vcpu);
+ if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
+ mutex_lock(&vcpu->kvm->lock);
+ mutex_unlock(&vcpu->kvm->lock);
+ }
}

if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e3aae6d..6c9361a 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -125,6 +125,7 @@ static inline bool is_error_page(struct page *page)
#define KVM_REQ_EPR_EXIT 20
#define KVM_REQ_SCAN_IOAPIC 21
#define KVM_REQ_GLOBAL_CLOCK_UPDATE 22
+#define KVM_REQ_STOP_VCPU 23

#define KVM_USERSPACE_IRQ_SOURCE_ID 0
#define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID 1
@@ -576,6 +577,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm);
void kvm_reload_remote_mmus(struct kvm *kvm);
void kvm_make_mclock_inprogress_request(struct kvm *kvm);
void kvm_make_scan_ioapic_request(struct kvm *kvm);
+int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data);

long kvm_arch_dev_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1580dd4..531e765 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
}

+int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
+{
+ int r;
+
+ mutex_lock(&kvm->lock);
+ make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
+ r = cb(data);
+ mutex_unlock(&kvm->lock);
+
+ return r;
+}
+
int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
{
struct page *page;
--
Gleb.

2013-06-09 08:56:52

by Xiao Guangrong

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On 06/09/2013 04:45 PM, Gleb Natapov wrote:
> On Sat, Jun 08, 2013 at 11:15:37AM +0800, Xiao Guangrong wrote:
>> From: Xiao Guangrong <[email protected]>
>>
>> Currently, memory synchronization is missed in emulator_fix_hypercall,
>> please see the commit 758ccc89b83
>> (KVM: x86: drop calling kvm_mmu_zap_all in emulator_fix_hypercall)
>>
>> This patch fixes it by introducing kvm_vcpus_hang_on_page_start() and
>> kvm_vcpus_hang_on_page_end which unmap the patched page from guest
>> and use kvm_flush_remote_tlbs() as the serializing instruction to
>> ensure the memory coherence
>> [ The SDM said that INVEPT, INVVPID and MOV (to control register, with
>> the exception of MOV CR8) are the serializing instructions. ]
>>
>> The mmu-lock is held during host patches the page so that it stops vcpus
>> to fix its further page fault
>>
> I have a patch to implement is much simple and in generic way, not
> relying on MMU internals.

I have considered this way but it seems not simple - it needs a new type of
request and it forces all vcpus to hang when host is patching the page.

My approach is just reusing the mmu code and requires vcpus to hang only when
the patched page is bing accessed.

2013-06-09 08:59:41

by Gleb Natapov

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On Sun, Jun 09, 2013 at 04:56:42PM +0800, Xiao Guangrong wrote:
> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
> > On Sat, Jun 08, 2013 at 11:15:37AM +0800, Xiao Guangrong wrote:
> >> From: Xiao Guangrong <[email protected]>
> >>
> >> Currently, memory synchronization is missed in emulator_fix_hypercall,
> >> please see the commit 758ccc89b83
> >> (KVM: x86: drop calling kvm_mmu_zap_all in emulator_fix_hypercall)
> >>
> >> This patch fixes it by introducing kvm_vcpus_hang_on_page_start() and
> >> kvm_vcpus_hang_on_page_end which unmap the patched page from guest
> >> and use kvm_flush_remote_tlbs() as the serializing instruction to
> >> ensure the memory coherence
> >> [ The SDM said that INVEPT, INVVPID and MOV (to control register, with
> >> the exception of MOV CR8) are the serializing instructions. ]
> >>
> >> The mmu-lock is held during host patches the page so that it stops vcpus
> >> to fix its further page fault
> >>
> > I have a patch to implement is much simple and in generic way, not
> > relying on MMU internals.
>
> I have considered this way but it seems not simple - it needs a new type of
> request and it forces all vcpus to hang when host is patching the page.
>
> My approach is just reusing the mmu code and requires vcpus to hang only when
> the patched page is bing accessed.

That's very rare, no point to optimize this code path.

--
Gleb.

2013-06-09 09:08:43

by Xiao Guangrong

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On 06/09/2013 04:59 PM, Gleb Natapov wrote:
> On Sun, Jun 09, 2013 at 04:56:42PM +0800, Xiao Guangrong wrote:
>> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
>>> On Sat, Jun 08, 2013 at 11:15:37AM +0800, Xiao Guangrong wrote:
>>>> From: Xiao Guangrong <[email protected]>
>>>>
>>>> Currently, memory synchronization is missed in emulator_fix_hypercall,
>>>> please see the commit 758ccc89b83
>>>> (KVM: x86: drop calling kvm_mmu_zap_all in emulator_fix_hypercall)
>>>>
>>>> This patch fixes it by introducing kvm_vcpus_hang_on_page_start() and
>>>> kvm_vcpus_hang_on_page_end which unmap the patched page from guest
>>>> and use kvm_flush_remote_tlbs() as the serializing instruction to
>>>> ensure the memory coherence
>>>> [ The SDM said that INVEPT, INVVPID and MOV (to control register, with
>>>> the exception of MOV CR8) are the serializing instructions. ]
>>>>
>>>> The mmu-lock is held during host patches the page so that it stops vcpus
>>>> to fix its further page fault
>>>>
>>> I have a patch to implement is much simple and in generic way, not
>>> relying on MMU internals.
>>
>> I have considered this way but it seems not simple - it needs a new type of
>> request and it forces all vcpus to hang when host is patching the page.
>>
>> My approach is just reusing the mmu code and requires vcpus to hang only when
>> the patched page is bing accessed.
>
> That's very rare, no point to optimize this code path.

Fine to me. I do not have strong opinion on that, please apply your patch instead. ;)


2013-06-09 09:29:45

by Xiao Guangrong

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On 06/09/2013 04:45 PM, Gleb Natapov wrote:

> +static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
> +{
> + struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
> + return kvm_exec_with_stopped_vcpu(vcpu->kvm,
> + emulator_fix_hypercall_cb, ctxt);
> +}
> +
> +
> /*
> * Check if userspace requested an interrupt window, and that the
> * interrupt window is open.
> @@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> kvm_deliver_pmi(vcpu);
> if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
> vcpu_scan_ioapic(vcpu);
> + if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
> + mutex_lock(&vcpu->kvm->lock);
> + mutex_unlock(&vcpu->kvm->lock);

We should execute a serializing instruction here?

> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
> make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
> }
>
> +int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
> +{
> + int r;
> +
> + mutex_lock(&kvm->lock);
> + make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
> + r = cb(data);

And here?

2013-06-09 09:39:14

by Gleb Natapov

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On Sun, Jun 09, 2013 at 05:29:37PM +0800, Xiao Guangrong wrote:
> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
>
> > +static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
> > +{
> > + struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
> > + return kvm_exec_with_stopped_vcpu(vcpu->kvm,
> > + emulator_fix_hypercall_cb, ctxt);
> > +}
> > +
> > +
> > /*
> > * Check if userspace requested an interrupt window, and that the
> > * interrupt window is open.
> > @@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> > kvm_deliver_pmi(vcpu);
> > if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
> > vcpu_scan_ioapic(vcpu);
> > + if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
> > + mutex_lock(&vcpu->kvm->lock);
> > + mutex_unlock(&vcpu->kvm->lock);
>
> We should execute a serializing instruction here?
>
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
> > make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
> > }
> >
> > +int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
> > +{
> > + int r;
> > +
> > + mutex_lock(&kvm->lock);
> > + make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
> > + r = cb(data);
>
> And here?
Since the serialisation instruction the SDM suggest to use is CPUID I
think the point here is to flush CPU pipeline. Since all vcpus are out
of a guest mode I think out of order execution of modified instruction
is no an issue here.

--
Gleb.

2013-06-09 10:01:53

by Xiao Guangrong

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On 06/09/2013 05:39 PM, Gleb Natapov wrote:
> On Sun, Jun 09, 2013 at 05:29:37PM +0800, Xiao Guangrong wrote:
>> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
>>
>>> +static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
>>> +{
>>> + struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
>>> + return kvm_exec_with_stopped_vcpu(vcpu->kvm,
>>> + emulator_fix_hypercall_cb, ctxt);
>>> +}
>>> +
>>> +
>>> /*
>>> * Check if userspace requested an interrupt window, and that the
>>> * interrupt window is open.
>>> @@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>> kvm_deliver_pmi(vcpu);
>>> if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
>>> vcpu_scan_ioapic(vcpu);
>>> + if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
>>> + mutex_lock(&vcpu->kvm->lock);
>>> + mutex_unlock(&vcpu->kvm->lock);
>>
>> We should execute a serializing instruction here?
>>
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
>>> make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
>>> }
>>>
>>> +int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
>>> +{
>>> + int r;
>>> +
>>> + mutex_lock(&kvm->lock);
>>> + make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
>>> + r = cb(data);
>>
>> And here?
> Since the serialisation instruction the SDM suggest to use is CPUID I
> think the point here is to flush CPU pipeline. Since all vcpus are out
> of a guest mode I think out of order execution of modified instruction
> is no an issue here.

I checked the SDM that it did not said VMLAUNCH/VMRESUME are the
serializing instructions both in VM-Entry description and Instruction
reference, instead it said the VMX related serializing instructions are:
INVEPT, INVVPID.

So, i guess the explicit serializing instruction is needed here.

2013-06-09 10:19:39

by Gleb Natapov

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On Sun, Jun 09, 2013 at 06:01:45PM +0800, Xiao Guangrong wrote:
> On 06/09/2013 05:39 PM, Gleb Natapov wrote:
> > On Sun, Jun 09, 2013 at 05:29:37PM +0800, Xiao Guangrong wrote:
> >> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
> >>
> >>> +static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
> >>> +{
> >>> + struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
> >>> + return kvm_exec_with_stopped_vcpu(vcpu->kvm,
> >>> + emulator_fix_hypercall_cb, ctxt);
> >>> +}
> >>> +
> >>> +
> >>> /*
> >>> * Check if userspace requested an interrupt window, and that the
> >>> * interrupt window is open.
> >>> @@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >>> kvm_deliver_pmi(vcpu);
> >>> if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
> >>> vcpu_scan_ioapic(vcpu);
> >>> + if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
> >>> + mutex_lock(&vcpu->kvm->lock);
> >>> + mutex_unlock(&vcpu->kvm->lock);
> >>
> >> We should execute a serializing instruction here?
> >>
> >>> --- a/virt/kvm/kvm_main.c
> >>> +++ b/virt/kvm/kvm_main.c
> >>> @@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
> >>> make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
> >>> }
> >>>
> >>> +int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
> >>> +{
> >>> + int r;
> >>> +
> >>> + mutex_lock(&kvm->lock);
> >>> + make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
> >>> + r = cb(data);
> >>
> >> And here?
> > Since the serialisation instruction the SDM suggest to use is CPUID I
> > think the point here is to flush CPU pipeline. Since all vcpus are out
> > of a guest mode I think out of order execution of modified instruction
> > is no an issue here.
>
> I checked the SDM that it did not said VMLAUNCH/VMRESUME are the
> serializing instructions both in VM-Entry description and Instruction
> reference, instead it said the VMX related serializing instructions are:
> INVEPT, INVVPID.
>
> So, i guess the explicit serializing instruction is needed here.
>
Again the question is what for? SDM says:

The Intel 64 and IA-32 architectures define several serializing
instructions. These instructions force the processor to complete all
modifications to flags, registers, and memory by previous instructions
and to drain all buffered writes to memory before the next instruction
is fetched and executed.

So flags and registers modifications on a host are obviously irrelevant for a guest.
And for memory ordering we have smp_mb() on a guest entry.


--
Gleb.

2013-06-09 11:25:25

by Xiao Guangrong

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On 06/09/2013 06:19 PM, Gleb Natapov wrote:
> On Sun, Jun 09, 2013 at 06:01:45PM +0800, Xiao Guangrong wrote:
>> On 06/09/2013 05:39 PM, Gleb Natapov wrote:
>>> On Sun, Jun 09, 2013 at 05:29:37PM +0800, Xiao Guangrong wrote:
>>>> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
>>>>
>>>>> +static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
>>>>> +{
>>>>> + struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
>>>>> + return kvm_exec_with_stopped_vcpu(vcpu->kvm,
>>>>> + emulator_fix_hypercall_cb, ctxt);
>>>>> +}
>>>>> +
>>>>> +
>>>>> /*
>>>>> * Check if userspace requested an interrupt window, and that the
>>>>> * interrupt window is open.
>>>>> @@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>>> kvm_deliver_pmi(vcpu);
>>>>> if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
>>>>> vcpu_scan_ioapic(vcpu);
>>>>> + if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
>>>>> + mutex_lock(&vcpu->kvm->lock);
>>>>> + mutex_unlock(&vcpu->kvm->lock);
>>>>
>>>> We should execute a serializing instruction here?
>>>>
>>>>> --- a/virt/kvm/kvm_main.c
>>>>> +++ b/virt/kvm/kvm_main.c
>>>>> @@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
>>>>> make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
>>>>> }
>>>>>
>>>>> +int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
>>>>> +{
>>>>> + int r;
>>>>> +
>>>>> + mutex_lock(&kvm->lock);
>>>>> + make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
>>>>> + r = cb(data);
>>>>
>>>> And here?
>>> Since the serialisation instruction the SDM suggest to use is CPUID I
>>> think the point here is to flush CPU pipeline. Since all vcpus are out
>>> of a guest mode I think out of order execution of modified instruction
>>> is no an issue here.
>>
>> I checked the SDM that it did not said VMLAUNCH/VMRESUME are the
>> serializing instructions both in VM-Entry description and Instruction
>> reference, instead it said the VMX related serializing instructions are:
>> INVEPT, INVVPID.
>>
>> So, i guess the explicit serializing instruction is needed here.
>>
> Again the question is what for? SDM says:
>
> The Intel 64 and IA-32 architectures define several serializing
> instructions. These instructions force the processor to complete all
> modifications to flags, registers, and memory by previous instructions
> and to drain all buffered writes to memory before the next instruction
> is fetched and executed.
>
> So flags and registers modifications on a host are obviously irrelevant for a guest.

Okay. Hmm... but what can guarantee that "drain all buffered writes to memory"?

> And for memory ordering we have smp_mb() on a guest entry.

If i understand the SDM correctly, memory-ordering instructions can not drain
instruction buffer, it only drains "data memory subsystem":

"The following instructions are memory-ordering instructions, not serializing instruc-
tions. These drain the data memory subsystem. They do not serialize the instruction
execution stream:"

No?

2013-06-09 11:36:31

by Gleb Natapov

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On Sun, Jun 09, 2013 at 07:25:17PM +0800, Xiao Guangrong wrote:
> On 06/09/2013 06:19 PM, Gleb Natapov wrote:
> > On Sun, Jun 09, 2013 at 06:01:45PM +0800, Xiao Guangrong wrote:
> >> On 06/09/2013 05:39 PM, Gleb Natapov wrote:
> >>> On Sun, Jun 09, 2013 at 05:29:37PM +0800, Xiao Guangrong wrote:
> >>>> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
> >>>>
> >>>>> +static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
> >>>>> +{
> >>>>> + struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
> >>>>> + return kvm_exec_with_stopped_vcpu(vcpu->kvm,
> >>>>> + emulator_fix_hypercall_cb, ctxt);
> >>>>> +}
> >>>>> +
> >>>>> +
> >>>>> /*
> >>>>> * Check if userspace requested an interrupt window, and that the
> >>>>> * interrupt window is open.
> >>>>> @@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >>>>> kvm_deliver_pmi(vcpu);
> >>>>> if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
> >>>>> vcpu_scan_ioapic(vcpu);
> >>>>> + if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
> >>>>> + mutex_lock(&vcpu->kvm->lock);
> >>>>> + mutex_unlock(&vcpu->kvm->lock);
> >>>>
> >>>> We should execute a serializing instruction here?
> >>>>
> >>>>> --- a/virt/kvm/kvm_main.c
> >>>>> +++ b/virt/kvm/kvm_main.c
> >>>>> @@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
> >>>>> make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
> >>>>> }
> >>>>>
> >>>>> +int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
> >>>>> +{
> >>>>> + int r;
> >>>>> +
> >>>>> + mutex_lock(&kvm->lock);
> >>>>> + make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
> >>>>> + r = cb(data);
> >>>>
> >>>> And here?
> >>> Since the serialisation instruction the SDM suggest to use is CPUID I
> >>> think the point here is to flush CPU pipeline. Since all vcpus are out
> >>> of a guest mode I think out of order execution of modified instruction
> >>> is no an issue here.
> >>
> >> I checked the SDM that it did not said VMLAUNCH/VMRESUME are the
> >> serializing instructions both in VM-Entry description and Instruction
> >> reference, instead it said the VMX related serializing instructions are:
> >> INVEPT, INVVPID.
> >>
> >> So, i guess the explicit serializing instruction is needed here.
> >>
> > Again the question is what for? SDM says:
> >
> > The Intel 64 and IA-32 architectures define several serializing
> > instructions. These instructions force the processor to complete all
> > modifications to flags, registers, and memory by previous instructions
> > and to drain all buffered writes to memory before the next instruction
> > is fetched and executed.
> >
> > So flags and registers modifications on a host are obviously irrelevant for a guest.
>
> Okay. Hmm... but what can guarantee that "drain all buffered writes to memory"?
Memory barrier should guaranty that as I said bellow.

>
> > And for memory ordering we have smp_mb() on a guest entry.
>
> If i understand the SDM correctly, memory-ordering instructions can not drain
> instruction buffer, it only drains "data memory subsystem":
What is "instruction buffer"?

>
> "The following instructions are memory-ordering instructions, not serializing instruc-
> tions. These drain the data memory subsystem. They do not serialize the instruction
> execution stream:"
>
> No?
Yes, but we have no issue with instruction execution stream as I said
above. No guest instruction can be in a pipeline while all vcpus are in
a host.

--
Gleb.

2013-06-09 11:44:11

by Xiao Guangrong

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On 06/09/2013 07:36 PM, Gleb Natapov wrote:
> On Sun, Jun 09, 2013 at 07:25:17PM +0800, Xiao Guangrong wrote:
>> On 06/09/2013 06:19 PM, Gleb Natapov wrote:
>>> On Sun, Jun 09, 2013 at 06:01:45PM +0800, Xiao Guangrong wrote:
>>>> On 06/09/2013 05:39 PM, Gleb Natapov wrote:
>>>>> On Sun, Jun 09, 2013 at 05:29:37PM +0800, Xiao Guangrong wrote:
>>>>>> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
>>>>>>
>>>>>>> +static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
>>>>>>> +{
>>>>>>> + struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
>>>>>>> + return kvm_exec_with_stopped_vcpu(vcpu->kvm,
>>>>>>> + emulator_fix_hypercall_cb, ctxt);
>>>>>>> +}
>>>>>>> +
>>>>>>> +
>>>>>>> /*
>>>>>>> * Check if userspace requested an interrupt window, and that the
>>>>>>> * interrupt window is open.
>>>>>>> @@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>>>>> kvm_deliver_pmi(vcpu);
>>>>>>> if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
>>>>>>> vcpu_scan_ioapic(vcpu);
>>>>>>> + if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
>>>>>>> + mutex_lock(&vcpu->kvm->lock);
>>>>>>> + mutex_unlock(&vcpu->kvm->lock);
>>>>>>
>>>>>> We should execute a serializing instruction here?
>>>>>>
>>>>>>> --- a/virt/kvm/kvm_main.c
>>>>>>> +++ b/virt/kvm/kvm_main.c
>>>>>>> @@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
>>>>>>> make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
>>>>>>> }
>>>>>>>
>>>>>>> +int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
>>>>>>> +{
>>>>>>> + int r;
>>>>>>> +
>>>>>>> + mutex_lock(&kvm->lock);
>>>>>>> + make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
>>>>>>> + r = cb(data);
>>>>>>
>>>>>> And here?
>>>>> Since the serialisation instruction the SDM suggest to use is CPUID I
>>>>> think the point here is to flush CPU pipeline. Since all vcpus are out
>>>>> of a guest mode I think out of order execution of modified instruction
>>>>> is no an issue here.
>>>>
>>>> I checked the SDM that it did not said VMLAUNCH/VMRESUME are the
>>>> serializing instructions both in VM-Entry description and Instruction
>>>> reference, instead it said the VMX related serializing instructions are:
>>>> INVEPT, INVVPID.
>>>>
>>>> So, i guess the explicit serializing instruction is needed here.
>>>>
>>> Again the question is what for? SDM says:
>>>
>>> The Intel 64 and IA-32 architectures define several serializing
>>> instructions. These instructions force the processor to complete all
>>> modifications to flags, registers, and memory by previous instructions
>>> and to drain all buffered writes to memory before the next instruction
>>> is fetched and executed.
>>>
>>> So flags and registers modifications on a host are obviously irrelevant for a guest.
>>
>> Okay. Hmm... but what can guarantee that "drain all buffered writes to memory"?
> Memory barrier should guaranty that as I said bellow.
>
>>
>>> And for memory ordering we have smp_mb() on a guest entry.
>>
>> If i understand the SDM correctly, memory-ordering instructions can not drain
>> instruction buffer, it only drains "data memory subsystem":
> What is "instruction buffer"?

I mean "Instruction Cache" (icache). Can memory ordering drain icache?
The "data memory subsystem" confused me, does it mean dcache?

>
>>
>> "The following instructions are memory-ordering instructions, not serializing instruc-
>> tions. These drain the data memory subsystem. They do not serialize the instruction
>> execution stream:"
>>
>> No?
> Yes, but we have no issue with instruction execution stream as I said
> above. No guest instruction can be in a pipeline while all vcpus are in
> a host.


2013-06-09 11:56:56

by Gleb Natapov

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On Sun, Jun 09, 2013 at 07:44:03PM +0800, Xiao Guangrong wrote:
> On 06/09/2013 07:36 PM, Gleb Natapov wrote:
> > On Sun, Jun 09, 2013 at 07:25:17PM +0800, Xiao Guangrong wrote:
> >> On 06/09/2013 06:19 PM, Gleb Natapov wrote:
> >>> On Sun, Jun 09, 2013 at 06:01:45PM +0800, Xiao Guangrong wrote:
> >>>> On 06/09/2013 05:39 PM, Gleb Natapov wrote:
> >>>>> On Sun, Jun 09, 2013 at 05:29:37PM +0800, Xiao Guangrong wrote:
> >>>>>> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
> >>>>>>
> >>>>>>> +static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
> >>>>>>> +{
> >>>>>>> + struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
> >>>>>>> + return kvm_exec_with_stopped_vcpu(vcpu->kvm,
> >>>>>>> + emulator_fix_hypercall_cb, ctxt);
> >>>>>>> +}
> >>>>>>> +
> >>>>>>> +
> >>>>>>> /*
> >>>>>>> * Check if userspace requested an interrupt window, and that the
> >>>>>>> * interrupt window is open.
> >>>>>>> @@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >>>>>>> kvm_deliver_pmi(vcpu);
> >>>>>>> if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
> >>>>>>> vcpu_scan_ioapic(vcpu);
> >>>>>>> + if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
> >>>>>>> + mutex_lock(&vcpu->kvm->lock);
> >>>>>>> + mutex_unlock(&vcpu->kvm->lock);
> >>>>>>
> >>>>>> We should execute a serializing instruction here?
> >>>>>>
> >>>>>>> --- a/virt/kvm/kvm_main.c
> >>>>>>> +++ b/virt/kvm/kvm_main.c
> >>>>>>> @@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
> >>>>>>> make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
> >>>>>>> }
> >>>>>>>
> >>>>>>> +int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
> >>>>>>> +{
> >>>>>>> + int r;
> >>>>>>> +
> >>>>>>> + mutex_lock(&kvm->lock);
> >>>>>>> + make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
> >>>>>>> + r = cb(data);
> >>>>>>
> >>>>>> And here?
> >>>>> Since the serialisation instruction the SDM suggest to use is CPUID I
> >>>>> think the point here is to flush CPU pipeline. Since all vcpus are out
> >>>>> of a guest mode I think out of order execution of modified instruction
> >>>>> is no an issue here.
> >>>>
> >>>> I checked the SDM that it did not said VMLAUNCH/VMRESUME are the
> >>>> serializing instructions both in VM-Entry description and Instruction
> >>>> reference, instead it said the VMX related serializing instructions are:
> >>>> INVEPT, INVVPID.
> >>>>
> >>>> So, i guess the explicit serializing instruction is needed here.
> >>>>
> >>> Again the question is what for? SDM says:
> >>>
> >>> The Intel 64 and IA-32 architectures define several serializing
> >>> instructions. These instructions force the processor to complete all
> >>> modifications to flags, registers, and memory by previous instructions
> >>> and to drain all buffered writes to memory before the next instruction
> >>> is fetched and executed.
> >>>
> >>> So flags and registers modifications on a host are obviously irrelevant for a guest.
> >>
> >> Okay. Hmm... but what can guarantee that "drain all buffered writes to memory"?
> > Memory barrier should guaranty that as I said bellow.
> >
> >>
> >>> And for memory ordering we have smp_mb() on a guest entry.
> >>
> >> If i understand the SDM correctly, memory-ordering instructions can not drain
> >> instruction buffer, it only drains "data memory subsystem":
> > What is "instruction buffer"?
>
> I mean "Instruction Cache" (icache). Can memory ordering drain icache?
> The "data memory subsystem" confused me, does it mean dcache?
>
I think it means all caches.
11.6 says:

A write to a memory location in a code segment that is currently
cached in the processor causes the associated cache line (or lines)
to be invalidated. This check is based on the physical address of
the instruction. In addition, the P6 family and Pentium processors
check whether a write to a code segment may modify an instruction that
has been prefetched for execution. If the write affects a prefetched
instruction, the prefetch queue is invalidated. This latter check is
based on the linear address of the instruction. For the Pentium 4 and
Intel Xeon processors, a write or a snoop of an instruction in a code
segment, where the target instruction is already decoded and resident in
the trace cache, invalidates the entire trace cache. The latter behavior
means that programs that self-modify code can cause severe degradation
of performance when run on the Pentium 4 and Intel Xeon processors.

So icache line is invalidate based on physical address so we are OK.
Prefetched instruction is invalidated based on linear address, but if
all vcpus are in a host guest instruction cannot be prefetched.

--
Gleb.

2013-06-09 12:17:27

by Xiao Guangrong

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On 06/09/2013 07:56 PM, Gleb Natapov wrote:
> On Sun, Jun 09, 2013 at 07:44:03PM +0800, Xiao Guangrong wrote:
>> On 06/09/2013 07:36 PM, Gleb Natapov wrote:
>>> On Sun, Jun 09, 2013 at 07:25:17PM +0800, Xiao Guangrong wrote:
>>>> On 06/09/2013 06:19 PM, Gleb Natapov wrote:
>>>>> On Sun, Jun 09, 2013 at 06:01:45PM +0800, Xiao Guangrong wrote:
>>>>>> On 06/09/2013 05:39 PM, Gleb Natapov wrote:
>>>>>>> On Sun, Jun 09, 2013 at 05:29:37PM +0800, Xiao Guangrong wrote:
>>>>>>>> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
>>>>>>>>
>>>>>>>>> +static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
>>>>>>>>> +{
>>>>>>>>> + struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
>>>>>>>>> + return kvm_exec_with_stopped_vcpu(vcpu->kvm,
>>>>>>>>> + emulator_fix_hypercall_cb, ctxt);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +
>>>>>>>>> /*
>>>>>>>>> * Check if userspace requested an interrupt window, and that the
>>>>>>>>> * interrupt window is open.
>>>>>>>>> @@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>>>>>>> kvm_deliver_pmi(vcpu);
>>>>>>>>> if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
>>>>>>>>> vcpu_scan_ioapic(vcpu);
>>>>>>>>> + if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
>>>>>>>>> + mutex_lock(&vcpu->kvm->lock);
>>>>>>>>> + mutex_unlock(&vcpu->kvm->lock);
>>>>>>>>
>>>>>>>> We should execute a serializing instruction here?
>>>>>>>>
>>>>>>>>> --- a/virt/kvm/kvm_main.c
>>>>>>>>> +++ b/virt/kvm/kvm_main.c
>>>>>>>>> @@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
>>>>>>>>> make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> +int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
>>>>>>>>> +{
>>>>>>>>> + int r;
>>>>>>>>> +
>>>>>>>>> + mutex_lock(&kvm->lock);
>>>>>>>>> + make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
>>>>>>>>> + r = cb(data);
>>>>>>>>
>>>>>>>> And here?
>>>>>>> Since the serialisation instruction the SDM suggest to use is CPUID I
>>>>>>> think the point here is to flush CPU pipeline. Since all vcpus are out
>>>>>>> of a guest mode I think out of order execution of modified instruction
>>>>>>> is no an issue here.
>>>>>>
>>>>>> I checked the SDM that it did not said VMLAUNCH/VMRESUME are the
>>>>>> serializing instructions both in VM-Entry description and Instruction
>>>>>> reference, instead it said the VMX related serializing instructions are:
>>>>>> INVEPT, INVVPID.
>>>>>>
>>>>>> So, i guess the explicit serializing instruction is needed here.
>>>>>>
>>>>> Again the question is what for? SDM says:
>>>>>
>>>>> The Intel 64 and IA-32 architectures define several serializing
>>>>> instructions. These instructions force the processor to complete all
>>>>> modifications to flags, registers, and memory by previous instructions
>>>>> and to drain all buffered writes to memory before the next instruction
>>>>> is fetched and executed.
>>>>>
>>>>> So flags and registers modifications on a host are obviously irrelevant for a guest.
>>>>
>>>> Okay. Hmm... but what can guarantee that "drain all buffered writes to memory"?
>>> Memory barrier should guaranty that as I said bellow.
>>>
>>>>
>>>>> And for memory ordering we have smp_mb() on a guest entry.
>>>>
>>>> If i understand the SDM correctly, memory-ordering instructions can not drain
>>>> instruction buffer, it only drains "data memory subsystem":
>>> What is "instruction buffer"?
>>
>> I mean "Instruction Cache" (icache). Can memory ordering drain icache?
>> The "data memory subsystem" confused me, does it mean dcache?
>>
> I think it means all caches.
> 11.6 says:
>
> A write to a memory location in a code segment that is currently
> cached in the processor causes the associated cache line (or lines)
> to be invalidated. This check is based on the physical address of
> the instruction. In addition, the P6 family and Pentium processors
> check whether a write to a code segment may modify an instruction that
> has been prefetched for execution. If the write affects a prefetched
> instruction, the prefetch queue is invalidated. This latter check is
> based on the linear address of the instruction. For the Pentium 4 and
> Intel Xeon processors, a write or a snoop of an instruction in a code
> segment, where the target instruction is already decoded and resident in
> the trace cache, invalidates the entire trace cache. The latter behavior
> means that programs that self-modify code can cause severe degradation
> of performance when run on the Pentium 4 and Intel Xeon processors.
>
> So icache line is invalidate based on physical address so we are OK.

Yes.

> Prefetched instruction is invalidated based on linear address, but if
> all vcpus are in a host guest instruction cannot be prefetched.

But what happen if the instruction has been prefetched before vcpu exits
to host? Then, after returns to guest, it executes the old instruction.

Can it happen?

2013-06-09 12:27:33

by Gleb Natapov

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On Sun, Jun 09, 2013 at 08:17:19PM +0800, Xiao Guangrong wrote:
> On 06/09/2013 07:56 PM, Gleb Natapov wrote:
> > On Sun, Jun 09, 2013 at 07:44:03PM +0800, Xiao Guangrong wrote:
> >> On 06/09/2013 07:36 PM, Gleb Natapov wrote:
> >>> On Sun, Jun 09, 2013 at 07:25:17PM +0800, Xiao Guangrong wrote:
> >>>> On 06/09/2013 06:19 PM, Gleb Natapov wrote:
> >>>>> On Sun, Jun 09, 2013 at 06:01:45PM +0800, Xiao Guangrong wrote:
> >>>>>> On 06/09/2013 05:39 PM, Gleb Natapov wrote:
> >>>>>>> On Sun, Jun 09, 2013 at 05:29:37PM +0800, Xiao Guangrong wrote:
> >>>>>>>> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
> >>>>>>>>
> >>>>>>>>> +static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
> >>>>>>>>> +{
> >>>>>>>>> + struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
> >>>>>>>>> + return kvm_exec_with_stopped_vcpu(vcpu->kvm,
> >>>>>>>>> + emulator_fix_hypercall_cb, ctxt);
> >>>>>>>>> +}
> >>>>>>>>> +
> >>>>>>>>> +
> >>>>>>>>> /*
> >>>>>>>>> * Check if userspace requested an interrupt window, and that the
> >>>>>>>>> * interrupt window is open.
> >>>>>>>>> @@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >>>>>>>>> kvm_deliver_pmi(vcpu);
> >>>>>>>>> if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
> >>>>>>>>> vcpu_scan_ioapic(vcpu);
> >>>>>>>>> + if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
> >>>>>>>>> + mutex_lock(&vcpu->kvm->lock);
> >>>>>>>>> + mutex_unlock(&vcpu->kvm->lock);
> >>>>>>>>
> >>>>>>>> We should execute a serializing instruction here?
> >>>>>>>>
> >>>>>>>>> --- a/virt/kvm/kvm_main.c
> >>>>>>>>> +++ b/virt/kvm/kvm_main.c
> >>>>>>>>> @@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
> >>>>>>>>> make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> +int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
> >>>>>>>>> +{
> >>>>>>>>> + int r;
> >>>>>>>>> +
> >>>>>>>>> + mutex_lock(&kvm->lock);
> >>>>>>>>> + make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
> >>>>>>>>> + r = cb(data);
> >>>>>>>>
> >>>>>>>> And here?
> >>>>>>> Since the serialisation instruction the SDM suggest to use is CPUID I
> >>>>>>> think the point here is to flush CPU pipeline. Since all vcpus are out
> >>>>>>> of a guest mode I think out of order execution of modified instruction
> >>>>>>> is no an issue here.
> >>>>>>
> >>>>>> I checked the SDM that it did not said VMLAUNCH/VMRESUME are the
> >>>>>> serializing instructions both in VM-Entry description and Instruction
> >>>>>> reference, instead it said the VMX related serializing instructions are:
> >>>>>> INVEPT, INVVPID.
> >>>>>>
> >>>>>> So, i guess the explicit serializing instruction is needed here.
> >>>>>>
> >>>>> Again the question is what for? SDM says:
> >>>>>
> >>>>> The Intel 64 and IA-32 architectures define several serializing
> >>>>> instructions. These instructions force the processor to complete all
> >>>>> modifications to flags, registers, and memory by previous instructions
> >>>>> and to drain all buffered writes to memory before the next instruction
> >>>>> is fetched and executed.
> >>>>>
> >>>>> So flags and registers modifications on a host are obviously irrelevant for a guest.
> >>>>
> >>>> Okay. Hmm... but what can guarantee that "drain all buffered writes to memory"?
> >>> Memory barrier should guaranty that as I said bellow.
> >>>
> >>>>
> >>>>> And for memory ordering we have smp_mb() on a guest entry.
> >>>>
> >>>> If i understand the SDM correctly, memory-ordering instructions can not drain
> >>>> instruction buffer, it only drains "data memory subsystem":
> >>> What is "instruction buffer"?
> >>
> >> I mean "Instruction Cache" (icache). Can memory ordering drain icache?
> >> The "data memory subsystem" confused me, does it mean dcache?
> >>
> > I think it means all caches.
> > 11.6 says:
> >
> > A write to a memory location in a code segment that is currently
> > cached in the processor causes the associated cache line (or lines)
> > to be invalidated. This check is based on the physical address of
> > the instruction. In addition, the P6 family and Pentium processors
> > check whether a write to a code segment may modify an instruction that
> > has been prefetched for execution. If the write affects a prefetched
> > instruction, the prefetch queue is invalidated. This latter check is
> > based on the linear address of the instruction. For the Pentium 4 and
> > Intel Xeon processors, a write or a snoop of an instruction in a code
> > segment, where the target instruction is already decoded and resident in
> > the trace cache, invalidates the entire trace cache. The latter behavior
> > means that programs that self-modify code can cause severe degradation
> > of performance when run on the Pentium 4 and Intel Xeon processors.
> >
> > So icache line is invalidate based on physical address so we are OK.
>
> Yes.
>
> > Prefetched instruction is invalidated based on linear address, but if
> > all vcpus are in a host guest instruction cannot be prefetched.
>
> But what happen if the instruction has been prefetched before vcpu exits
> to host? Then, after returns to guest, it executes the old instruction.
>
> Can it happen?
I do not thing so, prefetched instructions is not a cache, but I'll ask
Intel.

--
Gleb.

2013-06-09 12:52:12

by Xiao Guangrong

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On 06/09/2013 08:27 PM, Gleb Natapov wrote:
> On Sun, Jun 09, 2013 at 08:17:19PM +0800, Xiao Guangrong wrote:
>> On 06/09/2013 07:56 PM, Gleb Natapov wrote:
>>> On Sun, Jun 09, 2013 at 07:44:03PM +0800, Xiao Guangrong wrote:
>>>> On 06/09/2013 07:36 PM, Gleb Natapov wrote:
>>>>> On Sun, Jun 09, 2013 at 07:25:17PM +0800, Xiao Guangrong wrote:
>>>>>> On 06/09/2013 06:19 PM, Gleb Natapov wrote:
>>>>>>> On Sun, Jun 09, 2013 at 06:01:45PM +0800, Xiao Guangrong wrote:
>>>>>>>> On 06/09/2013 05:39 PM, Gleb Natapov wrote:
>>>>>>>>> On Sun, Jun 09, 2013 at 05:29:37PM +0800, Xiao Guangrong wrote:
>>>>>>>>>> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
>>>>>>>>>>
>>>>>>>>>>> +static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
>>>>>>>>>>> +{
>>>>>>>>>>> + struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
>>>>>>>>>>> + return kvm_exec_with_stopped_vcpu(vcpu->kvm,
>>>>>>>>>>> + emulator_fix_hypercall_cb, ctxt);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +
>>>>>>>>>>> /*
>>>>>>>>>>> * Check if userspace requested an interrupt window, and that the
>>>>>>>>>>> * interrupt window is open.
>>>>>>>>>>> @@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>>>>>>>>> kvm_deliver_pmi(vcpu);
>>>>>>>>>>> if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
>>>>>>>>>>> vcpu_scan_ioapic(vcpu);
>>>>>>>>>>> + if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
>>>>>>>>>>> + mutex_lock(&vcpu->kvm->lock);
>>>>>>>>>>> + mutex_unlock(&vcpu->kvm->lock);
>>>>>>>>>>
>>>>>>>>>> We should execute a serializing instruction here?
>>>>>>>>>>
>>>>>>>>>>> --- a/virt/kvm/kvm_main.c
>>>>>>>>>>> +++ b/virt/kvm/kvm_main.c
>>>>>>>>>>> @@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
>>>>>>>>>>> make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> +int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
>>>>>>>>>>> +{
>>>>>>>>>>> + int r;
>>>>>>>>>>> +
>>>>>>>>>>> + mutex_lock(&kvm->lock);
>>>>>>>>>>> + make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
>>>>>>>>>>> + r = cb(data);
>>>>>>>>>>
>>>>>>>>>> And here?
>>>>>>>>> Since the serialisation instruction the SDM suggest to use is CPUID I
>>>>>>>>> think the point here is to flush CPU pipeline. Since all vcpus are out
>>>>>>>>> of a guest mode I think out of order execution of modified instruction
>>>>>>>>> is no an issue here.
>>>>>>>>
>>>>>>>> I checked the SDM that it did not said VMLAUNCH/VMRESUME are the
>>>>>>>> serializing instructions both in VM-Entry description and Instruction
>>>>>>>> reference, instead it said the VMX related serializing instructions are:
>>>>>>>> INVEPT, INVVPID.
>>>>>>>>
>>>>>>>> So, i guess the explicit serializing instruction is needed here.
>>>>>>>>
>>>>>>> Again the question is what for? SDM says:
>>>>>>>
>>>>>>> The Intel 64 and IA-32 architectures define several serializing
>>>>>>> instructions. These instructions force the processor to complete all
>>>>>>> modifications to flags, registers, and memory by previous instructions
>>>>>>> and to drain all buffered writes to memory before the next instruction
>>>>>>> is fetched and executed.
>>>>>>>
>>>>>>> So flags and registers modifications on a host are obviously irrelevant for a guest.
>>>>>>
>>>>>> Okay. Hmm... but what can guarantee that "drain all buffered writes to memory"?
>>>>> Memory barrier should guaranty that as I said bellow.
>>>>>
>>>>>>
>>>>>>> And for memory ordering we have smp_mb() on a guest entry.
>>>>>>
>>>>>> If i understand the SDM correctly, memory-ordering instructions can not drain
>>>>>> instruction buffer, it only drains "data memory subsystem":
>>>>> What is "instruction buffer"?
>>>>
>>>> I mean "Instruction Cache" (icache). Can memory ordering drain icache?
>>>> The "data memory subsystem" confused me, does it mean dcache?
>>>>
>>> I think it means all caches.
>>> 11.6 says:
>>>
>>> A write to a memory location in a code segment that is currently
>>> cached in the processor causes the associated cache line (or lines)
>>> to be invalidated. This check is based on the physical address of
>>> the instruction. In addition, the P6 family and Pentium processors
>>> check whether a write to a code segment may modify an instruction that
>>> has been prefetched for execution. If the write affects a prefetched
>>> instruction, the prefetch queue is invalidated. This latter check is
>>> based on the linear address of the instruction. For the Pentium 4 and
>>> Intel Xeon processors, a write or a snoop of an instruction in a code
>>> segment, where the target instruction is already decoded and resident in
>>> the trace cache, invalidates the entire trace cache. The latter behavior
>>> means that programs that self-modify code can cause severe degradation
>>> of performance when run on the Pentium 4 and Intel Xeon processors.
>>>
>>> So icache line is invalidate based on physical address so we are OK.
>>
>> Yes.
>>
>>> Prefetched instruction is invalidated based on linear address, but if
>>> all vcpus are in a host guest instruction cannot be prefetched.
>>
>> But what happen if the instruction has been prefetched before vcpu exits
>> to host? Then, after returns to guest, it executes the old instruction.
>>
>> Can it happen?
> I do not thing so, prefetched instructions is not a cache, but I'll ask
> Intel.

Okay, thanks very much for your patience, Gleb!

2013-06-18 14:13:11

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

Il 09/06/2013 14:27, Gleb Natapov ha scritto:
> On Sun, Jun 09, 2013 at 08:17:19PM +0800, Xiao Guangrong wrote:
>> On 06/09/2013 07:56 PM, Gleb Natapov wrote:
>>> On Sun, Jun 09, 2013 at 07:44:03PM +0800, Xiao Guangrong wrote:
>>>> On 06/09/2013 07:36 PM, Gleb Natapov wrote:
>>>>> On Sun, Jun 09, 2013 at 07:25:17PM +0800, Xiao Guangrong wrote:
>>>>>> On 06/09/2013 06:19 PM, Gleb Natapov wrote:
>>>>>>> On Sun, Jun 09, 2013 at 06:01:45PM +0800, Xiao Guangrong wrote:
>>>>>>>> On 06/09/2013 05:39 PM, Gleb Natapov wrote:
>>>>>>>>> On Sun, Jun 09, 2013 at 05:29:37PM +0800, Xiao Guangrong wrote:
>>>>>>>>>> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
>>>>>>>>>>
>>>>>>>>>>> +static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
>>>>>>>>>>> +{
>>>>>>>>>>> + struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
>>>>>>>>>>> + return kvm_exec_with_stopped_vcpu(vcpu->kvm,
>>>>>>>>>>> + emulator_fix_hypercall_cb, ctxt);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +
>>>>>>>>>>> /*
>>>>>>>>>>> * Check if userspace requested an interrupt window, and that the
>>>>>>>>>>> * interrupt window is open.
>>>>>>>>>>> @@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>>>>>>>>> kvm_deliver_pmi(vcpu);
>>>>>>>>>>> if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
>>>>>>>>>>> vcpu_scan_ioapic(vcpu);
>>>>>>>>>>> + if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
>>>>>>>>>>> + mutex_lock(&vcpu->kvm->lock);
>>>>>>>>>>> + mutex_unlock(&vcpu->kvm->lock);
>>>>>>>>>>
>>>>>>>>>> We should execute a serializing instruction here?
>>>>>>>>>>
>>>>>>>>>>> --- a/virt/kvm/kvm_main.c
>>>>>>>>>>> +++ b/virt/kvm/kvm_main.c
>>>>>>>>>>> @@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
>>>>>>>>>>> make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> +int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
>>>>>>>>>>> +{
>>>>>>>>>>> + int r;
>>>>>>>>>>> +
>>>>>>>>>>> + mutex_lock(&kvm->lock);
>>>>>>>>>>> + make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
>>>>>>>>>>> + r = cb(data);
>>>>>>>>>>
>>>>>>>>>> And here?
>>>>>>>>> Since the serialisation instruction the SDM suggest to use is CPUID I
>>>>>>>>> think the point here is to flush CPU pipeline. Since all vcpus are out
>>>>>>>>> of a guest mode I think out of order execution of modified instruction
>>>>>>>>> is no an issue here.
>>>>>>>>
>>>>>>>> I checked the SDM that it did not said VMLAUNCH/VMRESUME are the
>>>>>>>> serializing instructions both in VM-Entry description and Instruction
>>>>>>>> reference, instead it said the VMX related serializing instructions are:
>>>>>>>> INVEPT, INVVPID.
>>>>>>>>
>>>>>>>> So, i guess the explicit serializing instruction is needed here.
>>>>>>>>
>>>>>>> Again the question is what for? SDM says:
>>>>>>>
>>>>>>> The Intel 64 and IA-32 architectures define several serializing
>>>>>>> instructions. These instructions force the processor to complete all
>>>>>>> modifications to flags, registers, and memory by previous instructions
>>>>>>> and to drain all buffered writes to memory before the next instruction
>>>>>>> is fetched and executed.
>>>>>>>
>>>>>>> So flags and registers modifications on a host are obviously irrelevant for a guest.
>>>>>>
>>>>>> Okay. Hmm... but what can guarantee that "drain all buffered writes to memory"?
>>>>> Memory barrier should guaranty that as I said bellow.
>>>>>
>>>>>>
>>>>>>> And for memory ordering we have smp_mb() on a guest entry.
>>>>>>
>>>>>> If i understand the SDM correctly, memory-ordering instructions can not drain
>>>>>> instruction buffer, it only drains "data memory subsystem":
>>>>> What is "instruction buffer"?
>>>>
>>>> I mean "Instruction Cache" (icache). Can memory ordering drain icache?
>>>> The "data memory subsystem" confused me, does it mean dcache?
>>>>
>>> I think it means all caches.
>>> 11.6 says:
>>>
>>> A write to a memory location in a code segment that is currently
>>> cached in the processor causes the associated cache line (or lines)
>>> to be invalidated. This check is based on the physical address of
>>> the instruction. In addition, the P6 family and Pentium processors
>>> check whether a write to a code segment may modify an instruction that
>>> has been prefetched for execution. If the write affects a prefetched
>>> instruction, the prefetch queue is invalidated. This latter check is
>>> based on the linear address of the instruction. For the Pentium 4 and
>>> Intel Xeon processors, a write or a snoop of an instruction in a code
>>> segment, where the target instruction is already decoded and resident in
>>> the trace cache, invalidates the entire trace cache. The latter behavior
>>> means that programs that self-modify code can cause severe degradation
>>> of performance when run on the Pentium 4 and Intel Xeon processors.
>>>
>>> So icache line is invalidate based on physical address so we are OK.
>>
>> Yes.
>>
>>> Prefetched instruction is invalidated based on linear address, but if
>>> all vcpus are in a host guest instruction cannot be prefetched.
>>
>> But what happen if the instruction has been prefetched before vcpu exits
>> to host? Then, after returns to guest, it executes the old instruction.
>>
>> Can it happen?
> I do not thing so, prefetched instructions is not a cache, but I'll ask
> Intel.

Any news?

Anyway, if this were the case (which seems strange, but you never know),
CPUID would not help. The hypothetical guest prefetch queue would not
be flushed, and you'd need INVEPT/INVVPID as Xiao mentioned upthread.

Paolo

2013-06-18 15:22:33

by Gleb Natapov

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: fix missed memory synchronization when patch hypercall

On Tue, Jun 18, 2013 at 04:13:02PM +0200, Paolo Bonzini wrote:
> Il 09/06/2013 14:27, Gleb Natapov ha scritto:
> > On Sun, Jun 09, 2013 at 08:17:19PM +0800, Xiao Guangrong wrote:
> >> On 06/09/2013 07:56 PM, Gleb Natapov wrote:
> >>> On Sun, Jun 09, 2013 at 07:44:03PM +0800, Xiao Guangrong wrote:
> >>>> On 06/09/2013 07:36 PM, Gleb Natapov wrote:
> >>>>> On Sun, Jun 09, 2013 at 07:25:17PM +0800, Xiao Guangrong wrote:
> >>>>>> On 06/09/2013 06:19 PM, Gleb Natapov wrote:
> >>>>>>> On Sun, Jun 09, 2013 at 06:01:45PM +0800, Xiao Guangrong wrote:
> >>>>>>>> On 06/09/2013 05:39 PM, Gleb Natapov wrote:
> >>>>>>>>> On Sun, Jun 09, 2013 at 05:29:37PM +0800, Xiao Guangrong wrote:
> >>>>>>>>>> On 06/09/2013 04:45 PM, Gleb Natapov wrote:
> >>>>>>>>>>
> >>>>>>>>>>> +static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
> >>>>>>>>>>> +{
> >>>>>>>>>>> + struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
> >>>>>>>>>>> + return kvm_exec_with_stopped_vcpu(vcpu->kvm,
> >>>>>>>>>>> + emulator_fix_hypercall_cb, ctxt);
> >>>>>>>>>>> +}
> >>>>>>>>>>> +
> >>>>>>>>>>> +
> >>>>>>>>>>> /*
> >>>>>>>>>>> * Check if userspace requested an interrupt window, and that the
> >>>>>>>>>>> * interrupt window is open.
> >>>>>>>>>>> @@ -5761,6 +5769,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >>>>>>>>>>> kvm_deliver_pmi(vcpu);
> >>>>>>>>>>> if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
> >>>>>>>>>>> vcpu_scan_ioapic(vcpu);
> >>>>>>>>>>> + if (kvm_check_request(KVM_REQ_STOP_VCPU, vcpu)){
> >>>>>>>>>>> + mutex_lock(&vcpu->kvm->lock);
> >>>>>>>>>>> + mutex_unlock(&vcpu->kvm->lock);
> >>>>>>>>>>
> >>>>>>>>>> We should execute a serializing instruction here?
> >>>>>>>>>>
> >>>>>>>>>>> --- a/virt/kvm/kvm_main.c
> >>>>>>>>>>> +++ b/virt/kvm/kvm_main.c
> >>>>>>>>>>> @@ -222,6 +222,18 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
> >>>>>>>>>>> make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> +int kvm_exec_with_stopped_vcpu(struct kvm *kvm, int (*cb)(void *), void *data)
> >>>>>>>>>>> +{
> >>>>>>>>>>> + int r;
> >>>>>>>>>>> +
> >>>>>>>>>>> + mutex_lock(&kvm->lock);
> >>>>>>>>>>> + make_all_cpus_request(kvm, KVM_REQ_STOP_VCPU);
> >>>>>>>>>>> + r = cb(data);
> >>>>>>>>>>
> >>>>>>>>>> And here?
> >>>>>>>>> Since the serialisation instruction the SDM suggest to use is CPUID I
> >>>>>>>>> think the point here is to flush CPU pipeline. Since all vcpus are out
> >>>>>>>>> of a guest mode I think out of order execution of modified instruction
> >>>>>>>>> is no an issue here.
> >>>>>>>>
> >>>>>>>> I checked the SDM that it did not said VMLAUNCH/VMRESUME are the
> >>>>>>>> serializing instructions both in VM-Entry description and Instruction
> >>>>>>>> reference, instead it said the VMX related serializing instructions are:
> >>>>>>>> INVEPT, INVVPID.
> >>>>>>>>
> >>>>>>>> So, i guess the explicit serializing instruction is needed here.
> >>>>>>>>
> >>>>>>> Again the question is what for? SDM says:
> >>>>>>>
> >>>>>>> The Intel 64 and IA-32 architectures define several serializing
> >>>>>>> instructions. These instructions force the processor to complete all
> >>>>>>> modifications to flags, registers, and memory by previous instructions
> >>>>>>> and to drain all buffered writes to memory before the next instruction
> >>>>>>> is fetched and executed.
> >>>>>>>
> >>>>>>> So flags and registers modifications on a host are obviously irrelevant for a guest.
> >>>>>>
> >>>>>> Okay. Hmm... but what can guarantee that "drain all buffered writes to memory"?
> >>>>> Memory barrier should guaranty that as I said bellow.
> >>>>>
> >>>>>>
> >>>>>>> And for memory ordering we have smp_mb() on a guest entry.
> >>>>>>
> >>>>>> If i understand the SDM correctly, memory-ordering instructions can not drain
> >>>>>> instruction buffer, it only drains "data memory subsystem":
> >>>>> What is "instruction buffer"?
> >>>>
> >>>> I mean "Instruction Cache" (icache). Can memory ordering drain icache?
> >>>> The "data memory subsystem" confused me, does it mean dcache?
> >>>>
> >>> I think it means all caches.
> >>> 11.6 says:
> >>>
> >>> A write to a memory location in a code segment that is currently
> >>> cached in the processor causes the associated cache line (or lines)
> >>> to be invalidated. This check is based on the physical address of
> >>> the instruction. In addition, the P6 family and Pentium processors
> >>> check whether a write to a code segment may modify an instruction that
> >>> has been prefetched for execution. If the write affects a prefetched
> >>> instruction, the prefetch queue is invalidated. This latter check is
> >>> based on the linear address of the instruction. For the Pentium 4 and
> >>> Intel Xeon processors, a write or a snoop of an instruction in a code
> >>> segment, where the target instruction is already decoded and resident in
> >>> the trace cache, invalidates the entire trace cache. The latter behavior
> >>> means that programs that self-modify code can cause severe degradation
> >>> of performance when run on the Pentium 4 and Intel Xeon processors.
> >>>
> >>> So icache line is invalidate based on physical address so we are OK.
> >>
> >> Yes.
> >>
> >>> Prefetched instruction is invalidated based on linear address, but if
> >>> all vcpus are in a host guest instruction cannot be prefetched.
> >>
> >> But what happen if the instruction has been prefetched before vcpu exits
> >> to host? Then, after returns to guest, it executes the old instruction.
> >>
> >> Can it happen?
> > I do not thing so, prefetched instructions is not a cache, but I'll ask
> > Intel.
>
> Any news?
>
Not yet.

> Anyway, if this were the case (which seems strange, but you never know),
> CPUID would not help. The hypothetical guest prefetch queue would not
> be flushed, and you'd need INVEPT/INVVPID as Xiao mentioned upthread.
Do not see why INVEPT/INVVPID is relevant. There is no issue with TLB
here.

--
Gleb.