LinuxLists.cc - Re: [PATCH 1/5] KVM: x86: Remove VMX support for virtualizing guest MTRR memtypes

2024-03-12 01:11:25

Subject: Re: [PATCH 1/5] KVM: x86: Remove VMX support for virtualizing guest MTRR memtypes

On 3/8/24 17:09, Sean Christopherson wrote:
> Remove KVM's support for virtualizing guest MTRR memtypes, as full MTRR
> adds no value, negatively impacts guest performance, and is a maintenance
> burden due to it's complexity and oddities.
>
> KVM's approach to virtualizating MTRRs make no sense, at all. KVM *only*
> honors guest MTRR memtypes if EPT is enabled *and* the guest has a device
> that may perform non-coherent DMA access. From a hardware virtualization
> perspective of guest MTRRs, there is _nothing_ special about EPT. Legacy
> shadowing paging doesn't magically account for guest MTRRs, nor does NPT.

[snip]

>
> -bool __kvm_mmu_honors_guest_mtrrs(bool vm_has_noncoherent_dma)
> +bool kvm_mmu_may_ignore_guest_pat(void)
> {
> /*
> - * If host MTRRs are ignored (shadow_memtype_mask is non-zero), and the
> - * VM has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is
> - * to honor the memtype from the guest's MTRRs so that guest accesses
> - * to memory that is DMA'd aren't cached against the guest's wishes.
> - *
> - * Note, KVM may still ultimately ignore guest MTRRs for certain PFNs,
> - * e.g. KVM will force UC memtype for host MMIO.
> + * When EPT is enabled (shadow_memtype_mask is non-zero), and the VM
> + * has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is to
> + * honor the memtype from the guest's PAT so that guest accesses to
> + * memory that is DMA'd aren't cached against the guest's wishes. As a
> + * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA,
> + * KVM _always_ ignores guest PAT (when EPT is enabled).
> */
> - return vm_has_noncoherent_dma && shadow_memtype_mask;
> + return shadow_memtype_mask;
> }
>

Any special reason to use the naming 'may_ignore_guest_pat', but not
'may_honor_guest_pat'?

Since it is also controlled by other cases, e.g., kvm_arch_has_noncoherent_dma()
at vmx_get_mt_mask(), it can be 'may_honor_guest_pat' too?

Therefore, why not directly use 'shadow_memtype_mask' (without the API), or some
naming like "ept_enabled_for_hardware".

Even with the code from PATCH 5/5, we still have high chance that VM has
non-coherent DMA?

bool kvm_mmu_may_ignore_guest_pat(void)
{
/*
- * When EPT is enabled (shadow_memtype_mask is non-zero), and the VM
+ * When EPT is enabled (shadow_memtype_mask is non-zero), the CPU does
+ * not support self-snoop (or is affected by an erratum), and the VM
* has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is to
* honor the memtype from the guest's PAT so that guest accesses to
* memory that is DMA'd aren't cached against the guest's wishes. As a
* result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA,
- * KVM _always_ ignores guest PAT (when EPT is enabled).
+ * KVM _always_ ignores or honors guest PAT, i.e. doesn't toggle SPTE
+ * bits in response to non-coherent device (un)registration.
*/
- return shadow_memtype_mask;
+ return !static_cpu_has(X86_FEATURE_SELFSNOOP) && shadow_memtype_mask;
}

Thank you very much!

Dongli Zhang

2024-03-12 17:21:32

by Sean Christopherson

[permalink] [raw]

Subject: Re: [PATCH 1/5] KVM: x86: Remove VMX support for virtualizing guest MTRR memtypes

On Mon, Mar 11, 2024, Dongli Zhang wrote:
>
>
> On 3/8/24 17:09, Sean Christopherson wrote:
> > Remove KVM's support for virtualizing guest MTRR memtypes, as full MTRR
> > adds no value, negatively impacts guest performance, and is a maintenance
> > burden due to it's complexity and oddities.
> >
> > KVM's approach to virtualizating MTRRs make no sense, at all. KVM *only*
> > honors guest MTRR memtypes if EPT is enabled *and* the guest has a device
> > that may perform non-coherent DMA access. From a hardware virtualization
> > perspective of guest MTRRs, there is _nothing_ special about EPT. Legacy
> > shadowing paging doesn't magically account for guest MTRRs, nor does NPT.
>
> [snip]
>
> >
> > -bool __kvm_mmu_honors_guest_mtrrs(bool vm_has_noncoherent_dma)
> > +bool kvm_mmu_may_ignore_guest_pat(void)
> > {
> > /*
> > - * If host MTRRs are ignored (shadow_memtype_mask is non-zero), and the
> > - * VM has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is
> > - * to honor the memtype from the guest's MTRRs so that guest accesses
> > - * to memory that is DMA'd aren't cached against the guest's wishes.
> > - *
> > - * Note, KVM may still ultimately ignore guest MTRRs for certain PFNs,
> > - * e.g. KVM will force UC memtype for host MMIO.
> > + * When EPT is enabled (shadow_memtype_mask is non-zero), and the VM
> > + * has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is to
> > + * honor the memtype from the guest's PAT so that guest accesses to
> > + * memory that is DMA'd aren't cached against the guest's wishes. As a
> > + * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA,
> > + * KVM _always_ ignores guest PAT (when EPT is enabled).
> > */
> > - return vm_has_noncoherent_dma && shadow_memtype_mask;
> > + return shadow_memtype_mask;
> > }
> >
>
> Any special reason to use the naming 'may_ignore_guest_pat', but not
> 'may_honor_guest_pat'?

Because which (after this series) is would either be misleading or outright wrong.
If KVM returns true from the helper based solely on shadow_memtype_mask, then it's
misleading because KVM will *always* honors guest PAT for such CPUs. I.e. that
name would yield this misleading statement.

If the CPU supports self-snoop, KVM may honor guest PAT.

If KVM returns true iff self-snoop is NOT available (as proposed in this series),
then it's outright wrong as KVM would return false, i.e. would make this incorrect
statement:

If the CPU supports self-snoop, KVM never honors guest PAT.

As saying that KVM may not or cannot do something is saying that KVM will never
do that thing.

And because the EPT flag is "ignore guest PAT", not "honor guest PAT", but that's
as much coincidence as it is anything else.

> Since it is also controlled by other cases, e.g., kvm_arch_has_noncoherent_dma()
> at vmx_get_mt_mask(), it can be 'may_honor_guest_pat' too?
>
> Therefore, why not directly use 'shadow_memtype_mask' (without the API), or some
> naming like "ept_enabled_for_hardware".

Again, after this series, KVM will *always* honor guest PAT for CPUs with self-snoop,
i.e. KVM will *never* ignore guest PAT. But for CPUs without self-snoop (or with
errata), KVM conditionally honors/ignores guest PAT.

> Even with the code from PATCH 5/5, we still have high chance that VM has
> non-coherent DMA?

I don't follow. On CPUs with self-snoop, whether or not the VM has non-coherent
DMA (from VFIO!) is irrelevant. If the CPU has self-snoop, then KVM can safely
honor guest PAT at all times.

> bool kvm_mmu_may_ignore_guest_pat(void)
> {
> /*
> - * When EPT is enabled (shadow_memtype_mask is non-zero), and the VM
> + * When EPT is enabled (shadow_memtype_mask is non-zero), the CPU does
> + * not support self-snoop (or is affected by an erratum), and the VM
> * has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is to
> * honor the memtype from the guest's PAT so that guest accesses to
> * memory that is DMA'd aren't cached against the guest's wishes. As a
> * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA,
> - * KVM _always_ ignores guest PAT (when EPT is enabled).
> + * KVM _always_ ignores or honors guest PAT, i.e. doesn't toggle SPTE
> + * bits in response to non-coherent device (un)registration.
> */
> - return shadow_memtype_mask;
> + return !static_cpu_has(X86_FEATURE_SELFSNOOP) && shadow_memtype_mask;
> }
>
>
> Thank you very much!
>
> Dongli Zhang

2024-03-14 10:31:41

by Dongli Zhang

[permalink] [raw]

Subject: Re: [PATCH 1/5] KVM: x86: Remove VMX support for virtualizing guest MTRR memtypes

On 3/12/24 10:08, Sean Christopherson wrote:
> On Mon, Mar 11, 2024, Dongli Zhang wrote:
>>
>>
>> On 3/8/24 17:09, Sean Christopherson wrote:
>>> Remove KVM's support for virtualizing guest MTRR memtypes, as full MTRR
>>> adds no value, negatively impacts guest performance, and is a maintenance
>>> burden due to it's complexity and oddities.
>>>
>>> KVM's approach to virtualizating MTRRs make no sense, at all. KVM *only*
>>> honors guest MTRR memtypes if EPT is enabled *and* the guest has a device
>>> that may perform non-coherent DMA access. From a hardware virtualization
>>> perspective of guest MTRRs, there is _nothing_ special about EPT. Legacy
>>> shadowing paging doesn't magically account for guest MTRRs, nor does NPT.
>>
>> [snip]
>>
>>>
>>> -bool __kvm_mmu_honors_guest_mtrrs(bool vm_has_noncoherent_dma)
>>> +bool kvm_mmu_may_ignore_guest_pat(void)
>>> {
>>> /*
>>> - * If host MTRRs are ignored (shadow_memtype_mask is non-zero), and the
>>> - * VM has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is
>>> - * to honor the memtype from the guest's MTRRs so that guest accesses
>>> - * to memory that is DMA'd aren't cached against the guest's wishes.
>>> - *
>>> - * Note, KVM may still ultimately ignore guest MTRRs for certain PFNs,
>>> - * e.g. KVM will force UC memtype for host MMIO.
>>> + * When EPT is enabled (shadow_memtype_mask is non-zero), and the VM
>>> + * has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is to
>>> + * honor the memtype from the guest's PAT so that guest accesses to
>>> + * memory that is DMA'd aren't cached against the guest's wishes. As a
>>> + * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA,
>>> + * KVM _always_ ignores guest PAT (when EPT is enabled).
>>> */
>>> - return vm_has_noncoherent_dma && shadow_memtype_mask;
>>> + return shadow_memtype_mask;
>>> }
>>>
>>
>> Any special reason to use the naming 'may_ignore_guest_pat', but not
>> 'may_honor_guest_pat'?
>
> Because which (after this series) is would either be misleading or outright wrong.
> If KVM returns true from the helper based solely on shadow_memtype_mask, then it's
> misleading because KVM will *always* honors guest PAT for such CPUs. I.e. that
> name would yield this misleading statement.
>
> If the CPU supports self-snoop, KVM may honor guest PAT.
>
> If KVM returns true iff self-snoop is NOT available (as proposed in this series),
> then it's outright wrong as KVM would return false, i.e. would make this incorrect
> statement:
>
> If the CPU supports self-snoop, KVM never honors guest PAT.
>
> As saying that KVM may not or cannot do something is saying that KVM will never
> do that thing.
>
> And because the EPT flag is "ignore guest PAT", not "honor guest PAT", but that's
> as much coincidence as it is anything else.
>
>> Since it is also controlled by other cases, e.g., kvm_arch_has_noncoherent_dma()
>> at vmx_get_mt_mask(), it can be 'may_honor_guest_pat' too?
>>
>> Therefore, why not directly use 'shadow_memtype_mask' (without the API), or some
>> naming like "ept_enabled_for_hardware".
>
> Again, after this series, KVM will *always* honor guest PAT for CPUs with self-snoop,
> i.e. KVM will *never* ignore guest PAT. But for CPUs without self-snoop (or with
> errata), KVM conditionally honors/ignores guest PAT.
>
>> Even with the code from PATCH 5/5, we still have high chance that VM has
>> non-coherent DMA?
>
> I don't follow. On CPUs with self-snoop, whether or not the VM has non-coherent
> DMA (from VFIO!) is irrelevant. If the CPU has self-snoop, then KVM can safely
> honor guest PAT at all times.

Thank you very much for the explanation.

According to my understanding of the explanation (after this series):

1. When static_cpu_has(X86_FEATURE_SELFSNOOP) == true, it is 100% to "honor
guest PAT".

2. When static_cpu_has(X86_FEATURE_SELFSNOOP) == false (and
shadow_memtype_mask), although only 50% chance (depending on where there is
non-coherent DMA), at least now it is NOT 100% (to honor guest PAT) any longer.

Due to the fact it is not 100% (to honor guest PAT) any longer, there starts the
trend (from 100% to 50%) to "ignore guest PAT", that is:
kvm_mmu_may_ignore_guest_pat().

Dongli Zhang

2024-03-14 14:48:15

by Sean Christopherson

[permalink] [raw]

Subject: Re: [PATCH 1/5] KVM: x86: Remove VMX support for virtualizing guest MTRR memtypes

On Thu, Mar 14, 2024, Dongli Zhang wrote:
> On 3/12/24 10:08, Sean Christopherson wrote:
> > On Mon, Mar 11, 2024, Dongli Zhang wrote:
> >> Since it is also controlled by other cases, e.g., kvm_arch_has_noncoherent_dma()
> >> at vmx_get_mt_mask(), it can be 'may_honor_guest_pat' too?
> >>
> >> Therefore, why not directly use 'shadow_memtype_mask' (without the API), or some
> >> naming like "ept_enabled_for_hardware".
> >
> > Again, after this series, KVM will *always* honor guest PAT for CPUs with self-snoop,
> > i.e. KVM will *never* ignore guest PAT. But for CPUs without self-snoop (or with
> > errata), KVM conditionally honors/ignores guest PAT.
> >
> >> Even with the code from PATCH 5/5, we still have high chance that VM has
> >> non-coherent DMA?
> >
> > I don't follow. On CPUs with self-snoop, whether or not the VM has non-coherent
> > DMA (from VFIO!) is irrelevant. If the CPU has self-snoop, then KVM can safely
> > honor guest PAT at all times.
>
>
> Thank you very much for the explanation.
>
> According to my understanding of the explanation (after this series):
>
> 1. When static_cpu_has(X86_FEATURE_SELFSNOOP) == true, it is 100% to "honor
> guest PAT".

Yes.

> 2. When static_cpu_has(X86_FEATURE_SELFSNOOP) == false (and
> shadow_memtype_mask), although only 50% chance (depending on where there is
> non-coherent DMA), at least now it is NOT 100% (to honor guest PAT) any longer.

Yes, though I wouldn't assign a percent probability to the non-coherent DMA case.

> Due to the fact it is not 100% (to honor guest PAT) any longer, there starts the
> trend (from 100% to 50%) to "ignore guest PAT", that is:
> kvm_mmu_may_ignore_guest_pat().

Yep.