I've come across a case with a VM running on Hyper-V that doesn't get
MTRRs, but the PAT is functional. (This is a Confidential VM using
AMD's SEV-SNP encryption technology with the vTOM option.) In this
case, the changes in commit 72cbc8f04fe2 ("x86/PAT: Have pat_enabled()
properly reflect state when running on Xen") apply. pat_enabled() returns
"true", but the MTRRs are not enabled.
But with this commit, there's a problem. Consider memremap() on a RAM
region, called with MEMREMAP_WB plus MEMREMAP_DEC as the 3rd
argument. Because of the request for a decrypted mapping,
arch_memremap_can_ram_remap() returns false, and a new mapping
must be created, which is appropriate.
The following call stack results:

  memremap()
    arch_memremap_wb()
      ioremap_cache()
        __ioremap_caller()
          memtype_reserve() <--- pcm is _PAGE_CACHE_MODE_WB
            pat_x_mtrr_type() <-- only called after commit 72cbc8f04fe2

pat_x_mtrr_type() returns _PAGE_CACHE_MODE_UC_MINUS because
mtrr_type_lookup() fails. As a result, memremap() erroneously creates the
new mapping as uncached. This uncached mapping is causing a significant
performance problem in certain Hyper-V Confidential VM configurations.
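
To make the failure mode concrete, here is a minimal userspace model of the
decision described above. It is a sketch, not the kernel source: every
model_* name and every enum value is a stand-in invented for illustration,
and model_lookup_no_mtrrs() only imitates a failing mtrr_type_lookup().

```c
#include <stdint.h>

/* Stand-in values for illustration only; not copied from kernel headers. */
enum model_mtrr_type {
	MODEL_MTRR_UNCACHABLE,
	MODEL_MTRR_WRBACK,
	MODEL_MTRR_INVALID,	/* lookup failed: MTRRs absent or disabled */
};

enum model_cache_mode {
	MODEL_MODE_WB,
	MODEL_MODE_UC_MINUS,
	MODEL_MODE_WC,
};

/* Hypothetical stand-in for mtrr_type_lookup() in a guest without MTRRs. */
static enum model_mtrr_type model_lookup_no_mtrrs(uint64_t start, uint64_t end)
{
	(void)start;
	(void)end;
	return MODEL_MTRR_INVALID;
}

/*
 * Models the behavior reported above: for a WB request, any lookup result
 * other than WB, including a failed lookup, demotes the mapping to UC-.
 */
static enum model_cache_mode
model_pat_x_mtrr_type(enum model_mtrr_type mtrr, enum model_cache_mode req)
{
	if (req == MODEL_MODE_WB && mtrr != MODEL_MTRR_WRBACK)
		return MODEL_MODE_UC_MINUS;

	return req;
}
```

In this model, a WB request combined with the failing lookup comes back as
MODEL_MODE_UC_MINUS, which mirrors the uncached mapping seen in the
Confidential VM. Requests for other modes pass through unchanged.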
Any thoughts on resolving this? Should memtype_reserve() be checking
both pat_enabled() *and* whether MTRRs are enabled before calling
pat_x_mtrr_type()? Or does that defeat the purpose of commit
72cbc8f04fe2 in the Xen environment?
I'm also looking at how to avoid this combination in a Hyper-V Confidential
VM, but that doesn't address the underlying flaw.
Michael
On 09.01.23 19:28, Michael Kelley (LINUX) wrote:
> I've come across a case with a VM running on Hyper-V that doesn't get
> MTRRs, but the PAT is functional. (This is a Confidential VM using
> AMD's SEV-SNP encryption technology with the vTOM option.) In this
> case, the changes in commit 72cbc8f04fe2 ("x86/PAT: Have pat_enabled()
> properly reflect state when running on Xen") apply. pat_enabled() returns
> "true", but the MTRRs are not enabled.
>
> But with this commit, there's a problem. Consider memremap() on a RAM
> region, called with MEMREMAP_WB plus MEMREMAP_DEC as the 3rd
> argument. Because of the request for a decrypted mapping,
> arch_memremap_can_ram_remap() returns false, and a new mapping
> must be created, which is appropriate.
>
> The following call stack results:
>
>   memremap()
>     arch_memremap_wb()
>       ioremap_cache()
>         __ioremap_caller()
>           memtype_reserve() <--- pcm is _PAGE_CACHE_MODE_WB
>             pat_x_mtrr_type() <-- only called after commit 72cbc8f04fe2
>
> pat_x_mtrr_type() returns _PAGE_CACHE_MODE_UC_MINUS because
> mtrr_type_lookup() fails. As a result, memremap() erroneously creates the
> new mapping as uncached. This uncached mapping is causing a significant
> performance problem in certain Hyper-V Confidential VM configurations.
>
> Any thoughts on resolving this? Should memtype_reserve() be checking
> both pat_enabled() *and* whether MTRRs are enabled before calling
> pat_x_mtrr_type()? Or does that defeat the purpose of commit
> 72cbc8f04fe2 in the Xen environment?
I think pat_x_mtrr_type() should return _PAGE_CACHE_MODE_UC_MINUS only if
mtrr_type_lookup() is not failing and is returning a mode other than WB.
I'll send a patch.
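
Roughly, the intended rule could look like the following userspace sketch.
This is only an illustration of the condition, not the patch itself: the
sketch_* names and enum values are made up, and the MTRR lookup result is
passed in as a parameter instead of being obtained from mtrr_type_lookup().

```c
/* Stand-in values for illustration only; not copied from kernel headers. */
enum sketch_mtrr_type {
	SKETCH_MTRR_UNCACHABLE,
	SKETCH_MTRR_WRBACK,
	SKETCH_MTRR_INVALID,	/* lookup failed: no usable MTRR information */
};

enum sketch_cache_mode {
	SKETCH_MODE_WB,
	SKETCH_MODE_UC_MINUS,
};

static enum sketch_cache_mode
sketch_pat_x_mtrr_type(enum sketch_mtrr_type mtrr, enum sketch_cache_mode req)
{
	/* Only WB requests consult the MTRR hint. */
	if (req != SKETCH_MODE_WB)
		return req;

	/*
	 * Demote to UC- only when the lookup succeeded and reported a type
	 * other than WB. A failed lookup leaves the WB request alone.
	 */
	if (mtrr != SKETCH_MTRR_INVALID && mtrr != SKETCH_MTRR_WRBACK)
		return SKETCH_MODE_UC_MINUS;

	return SKETCH_MODE_WB;
}
```

With this rule, a failed lookup keeps the requested WB mode, while a
successful lookup that reports a non-WB type still yields UC-.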
> I'm also looking at how to avoid this combination in a Hyper-V Confidential
> VM, but that doesn't address the underlying flaw.
Yes.
Juergen
On 10.01.23 06:47, Juergen Gross wrote:
> On 09.01.23 19:28, Michael Kelley (LINUX) wrote:
>> I've come across a case with a VM running on Hyper-V that doesn't get
>> MTRRs, but the PAT is functional. (This is a Confidential VM using
>> AMD's SEV-SNP encryption technology with the vTOM option.) In this
>> case, the changes in commit 72cbc8f04fe2 ("x86/PAT: Have pat_enabled()
>> properly reflect state when running on Xen") apply. pat_enabled() returns
>> "true", but the MTRRs are not enabled.
>>
>> But with this commit, there's a problem. Consider memremap() on a RAM
>> region, called with MEMREMAP_WB plus MEMREMAP_DEC as the 3rd
>> argument. Because of the request for a decrypted mapping,
>> arch_memremap_can_ram_remap() returns false, and a new mapping
>> must be created, which is appropriate.
>>
>> The following call stack results:
>>
>>   memremap()
>>     arch_memremap_wb()
>>       ioremap_cache()
>>         __ioremap_caller()
>>           memtype_reserve() <--- pcm is _PAGE_CACHE_MODE_WB
>>             pat_x_mtrr_type() <-- only called after commit 72cbc8f04fe2
>>
>> pat_x_mtrr_type() returns _PAGE_CACHE_MODE_UC_MINUS because
>> mtrr_type_lookup() fails. As a result, memremap() erroneously creates the
>> new mapping as uncached. This uncached mapping is causing a significant
>> performance problem in certain Hyper-V Confidential VM configurations.
>>
>> Any thoughts on resolving this? Should memtype_reserve() be checking
>> both pat_enabled() *and* whether MTRRs are enabled before calling
>> pat_x_mtrr_type()? Or does that defeat the purpose of commit
>> 72cbc8f04fe2 in the Xen environment?
>
> I think pat_x_mtrr_type() should return _PAGE_CACHE_MODE_UC_MINUS only if
> mtrr_type_lookup() is not failing and is returning a mode other than WB.
Another idea would be to let the mtrr_type_lookup() stub in
arch/x86/include/asm/mtrr.h return MTRR_TYPE_WRBACK. That would allow
simplifying pud_set_huge() and pmd_set_huge() by removing the check for
MTRR_TYPE_INVALID.
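
As a rough userspace sketch of that alternative: the alt_* names and enum
values below are invented for illustration, and the shape of the caller
check (including the "uniform" flag) is an assumption about how the huge
page setters use the lookup result, not a copy of their code.

```c
#include <stdbool.h>

/* Stand-in values for illustration only; not copied from kernel headers. */
enum alt_mtrr_type {
	ALT_MTRR_UNCACHABLE,
	ALT_MTRR_WRBACK,
	ALT_MTRR_INVALID,
};

/* Stub as it behaves today per this thread: report that there are no MTRRs. */
static enum alt_mtrr_type alt_stub_current(void)
{
	return ALT_MTRR_INVALID;
}

/* Alternative stub: pretend everything is WB when MTRRs are unavailable. */
static enum alt_mtrr_type alt_stub_proposed(void)
{
	return ALT_MTRR_WRBACK;
}

/* Caller check that has to special-case a failed lookup. */
static bool alt_huge_ok_current(enum alt_mtrr_type mtrr, bool uniform)
{
	if (mtrr != ALT_MTRR_INVALID && !uniform && mtrr != ALT_MTRR_WRBACK)
		return false;

	return true;
}

/* Same check with the INVALID test dropped; relies on the proposed stub. */
static bool alt_huge_ok_simplified(enum alt_mtrr_type mtrr, bool uniform)
{
	if (!uniform && mtrr != ALT_MTRR_WRBACK)
		return false;

	return true;
}
```

In this model, alt_huge_ok_current() fed by alt_stub_current() and
alt_huge_ok_simplified() fed by alt_stub_proposed() give the same answer,
which is the simplification meant above.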
Juergen
On 10.01.2023 06:59, Juergen Gross wrote:
> On 10.01.23 06:47, Juergen Gross wrote:
>> On 09.01.23 19:28, Michael Kelley (LINUX) wrote:
>>> I've come across a case with a VM running on Hyper-V that doesn't get
>>> MTRRs, but the PAT is functional. (This is a Confidential VM using
>>> AMD's SEV-SNP encryption technology with the vTOM option.) In this
>>> case, the changes in commit 72cbc8f04fe2 ("x86/PAT: Have pat_enabled()
>>> properly reflect state when running on Xen") apply. pat_enabled() returns
>>> "true", but the MTRRs are not enabled.
>>>
>>> But with this commit, there's a problem. Consider memremap() on a RAM
>>> region, called with MEMREMAP_WB plus MEMREMAP_DEC as the 3rd
>>> argument. Because of the request for a decrypted mapping,
>>> arch_memremap_can_ram_remap() returns false, and a new mapping
>>> must be created, which is appropriate.
>>>
>>> The following call stack results:
>>>
>>>   memremap()
>>>     arch_memremap_wb()
>>>       ioremap_cache()
>>>         __ioremap_caller()
>>>           memtype_reserve() <--- pcm is _PAGE_CACHE_MODE_WB
>>>             pat_x_mtrr_type() <-- only called after commit 72cbc8f04fe2
>>>
>>> pat_x_mtrr_type() returns _PAGE_CACHE_MODE_UC_MINUS because
>>> mtrr_type_lookup() fails. As a result, memremap() erroneously creates the
>>> new mapping as uncached. This uncached mapping is causing a significant
>>> performance problem in certain Hyper-V Confidential VM configurations.
>>>
>>> Any thoughts on resolving this? Should memtype_reserve() be checking
>>> both pat_enabled() *and* whether MTRRs are enabled before calling
>>> pat_x_mtrr_type()? Or does that defeat the purpose of commit
>>> 72cbc8f04fe2 in the Xen environment?
>>
>> I think pat_x_mtrr_type() should return _PAGE_CACHE_MODE_UC_MINUS only if
>> mtrr_type_lookup() is not failing and is returning a mode other than WB.
I agree.
> Another idea would be to let the mtrr_type_lookup() stub in
> arch/x86/include/asm/mtrr.h return MTRR_TYPE_WRBACK, enabling to simplify
> pud_set_huge() and pmd_set_huge() by removing the check for MTRR_TYPE_INVALID.
But that has a risk of ending up misleading: When there are no MTRRs, there
simply is no default type (in the absence of inspecting other criteria).
Jan
On 10.01.23 10:38, Jan Beulich wrote:
> On 10.01.2023 06:59, Juergen Gross wrote:
>> On 10.01.23 06:47, Juergen Gross wrote:
>>> On 09.01.23 19:28, Michael Kelley (LINUX) wrote:
>>>> I've come across a case with a VM running on Hyper-V that doesn't get
>>>> MTRRs, but the PAT is functional. (This is a Confidential VM using
>>>> AMD's SEV-SNP encryption technology with the vTOM option.) In this
>>>> case, the changes in commit 72cbc8f04fe2 ("x86/PAT: Have pat_enabled()
>>>> properly reflect state when running on Xen") apply. pat_enabled() returns
>>>> "true", but the MTRRs are not enabled.
>>>>
>>>> But with this commit, there's a problem. Consider memremap() on a RAM
>>>> region, called with MEMREMAP_WB plus MEMREMAP_DEC as the 3rd
>>>> argument. Because of the request for a decrypted mapping,
>>>> arch_memremap_can_ram_remap() returns false, and a new mapping
>>>> must be created, which is appropriate.
>>>>
>>>> The following call stack results:
>>>>
>>>>   memremap()
>>>>     arch_memremap_wb()
>>>>       ioremap_cache()
>>>>         __ioremap_caller()
>>>>           memtype_reserve() <--- pcm is _PAGE_CACHE_MODE_WB
>>>>             pat_x_mtrr_type() <-- only called after commit 72cbc8f04fe2
>>>>
>>>> pat_x_mtrr_type() returns _PAGE_CACHE_MODE_UC_MINUS because
>>>> mtrr_type_lookup() fails. As a result, memremap() erroneously creates the
>>>> new mapping as uncached. This uncached mapping is causing a significant
>>>> performance problem in certain Hyper-V Confidential VM configurations.
>>>>
>>>> Any thoughts on resolving this? Should memtype_reserve() be checking
>>>> both pat_enabled() *and* whether MTRRs are enabled before calling
>>>> pat_x_mtrr_type()? Or does that defeat the purpose of commit
>>>> 72cbc8f04fe2 in the Xen environment?
>>>
>>> I think pat_x_mtrr_type() should return _PAGE_CACHE_MODE_UC_MINUS only if
>>> mtrr_type_lookup() is not failing and is returning a mode other than WB.
>
> I agree.
>
>> Another idea would be to let the mtrr_type_lookup() stub in
>> arch/x86/include/asm/mtrr.h return MTRR_TYPE_WRBACK, enabling to simplify
>> pud_set_huge() and pmd_set_huge() by removing the check for MTRR_TYPE_INVALID.
>
> But that has a risk of ending up misleading: When there are no MTRRs, there
> simply is no default type (in the absence of inspecting other criteria).
I've sent a patch checking for MTRR_TYPE_INVALID in pat_x_mtrr_type(). This
seemed to be a less intrusive change.
The idea to modify the stub came up as a result of looking at mtrr_type_lookup()
use cases after writing my patch. All users now take an action if the returned
type is not WB and not INVALID. So it would be a modification tailored for
today's mtrr_type_lookup() users only.
Juergen