From: Juergen Gross <[email protected]> Sent: Sunday, February 12, 2023 10:27 PM
>
> On 13.02.23 02:07, Michael Kelley (LINUX) wrote:
> > From: Juergen Gross <[email protected]> Sent: Wednesday, February 8, 2023 11:22 PM
> >>
> >> When running virtualized, MTRR access can be reduced (e.g. in Xen PV
> >> guests or when running as a SEV-SNP guest under Hyper-V). Typically
> >> the hypervisor will reset the MTRR feature in cpuid data, resulting
> >> in no MTRR memory type information being available for the kernel.
> >>
> >> This has turned out to result in problems:
> >>
> >> - Hyper-V SEV-SNP guests using uncached mappings where they shouldn't
> >> - Xen PV dom0 mapping memory as WB which should be UC- instead
> >>
> >> Solve those problems by supporting to set a fixed MTRR state,
> >> overwriting the empty state used today. In case such a state has been
> >> set, don't call get_mtrr_state() in mtrr_bp_init(). The set state
> >> will only be used by mtrr_type_lookup(), as in all other cases
> >> mtrr_enabled() is being checked, which will return false. Accept the
> >> overwrite call only in case of MTRRs being disabled in cpuid.
> >>
> >> Signed-off-by: Juergen Gross <[email protected]>
> >> ---
> >> V2:
> >> - new patch
> >> ---
> >> arch/x86/include/asm/mtrr.h | 2 ++
> >> arch/x86/kernel/cpu/mtrr/generic.c | 38 ++++++++++++++++++++++++++++++
> >> arch/x86/kernel/cpu/mtrr/mtrr.c | 9 +++++++
> >> 3 files changed, 49 insertions(+)
> >>
> >> diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
> >> index f0eeaf6e5f5f..0b8f51d683dc 100644
> >> --- a/arch/x86/include/asm/mtrr.h
> >> +++ b/arch/x86/include/asm/mtrr.h
> >> @@ -31,6 +31,8 @@
> >> */
> >> # ifdef CONFIG_MTRR
> >> void mtrr_bp_init(void);
> >> +void mtrr_overwrite_state(struct mtrr_var_range *var, unsigned int num_var,
> >> + mtrr_type *fixed, mtrr_type def_type);
> >
> > Could you add a stub for the !CONFIG_MTRR case? Then the
> > #ifdef CONFIG_MTRR could be removed in Patch 3 of this series.
>
> I was on the edge whether to add a stub. The Xen use case strongly
> suggests that the code wants to be inside an #ifdef, while the Hyper-V
> case is so simple, that it would benefit from the stub. As there was
> another #ifdef just above the added code in mshyperv.c I believed it
> would be fine without a stub. As you seem to like it better with the
> stub, I can add it.
>

Thanks. And that other #ifdef is going away soon ...

Michael
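
For reference, the stub requested above could look roughly like the sketch below. It is a user-space model, not the actual header change: the type definitions are stand-ins for the kernel ones, and example_hv_init() is a hypothetical caller that only shows the call site compiling without an #ifdef when CONFIG_MTRR is not defined.

```c
/* Sketch only: user-space stand-ins, not the real arch/x86 headers. */
#include <stddef.h>

typedef unsigned char mtrr_type;	/* stand-in for the kernel typedef */
struct mtrr_var_range {			/* stand-in for the kernel struct */
	unsigned int base_lo, base_hi, mask_lo, mask_hi;
};

#define MTRR_TYPE_WRBACK 6

#ifdef CONFIG_MTRR
void mtrr_overwrite_state(struct mtrr_var_range *var, unsigned int num_var,
			  mtrr_type *fixed, mtrr_type def_type);
#else
/* Empty stub: callers build unchanged and the call becomes a no-op. */
static inline void mtrr_overwrite_state(struct mtrr_var_range *var,
					unsigned int num_var,
					mtrr_type *fixed, mtrr_type def_type)
{
	(void)var;
	(void)num_var;
	(void)fixed;
	(void)def_type;
}
#endif

/* Hypothetical caller modelled on the Hyper-V case: no #ifdef needed. */
static int example_hv_init(void)
{
	mtrr_overwrite_state(NULL, 0, NULL, MTRR_TYPE_WRBACK);
	return 0;
}
```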

2023-02-13 11:40:17

by Borislav Petkov

Subject: Re: [PATCH v2 2/8] x86/mtrr: support setting MTRR state for software defined MTRRs

On Thu, Feb 09, 2023 at 08:22:14AM +0100, Juergen Gross wrote:
> When running virtualized, MTRR access can be reduced (e.g. in Xen PV
> guests or when running as a SEV-SNP guest under Hyper-V). Typically
> the hypervisor will reset the MTRR feature in cpuid data, resulting
> in no MTRR memory type information being available for the kernel.
>
> This has turned out to result in problems:
>
> - Hyper-V SEV-SNP guests using uncached mappings where they shouldn't
> - Xen PV dom0 mapping memory as WB which should be UC- instead
>
> Solve those problems by supporting to set a fixed MTRR state,
> overwriting the empty state used today. In case such a state has been
> set, don't call get_mtrr_state() in mtrr_bp_init(). The set state
> will only be used by mtrr_type_lookup(), as in all other cases
> mtrr_enabled() is being checked, which will return false. Accept the
> overwrite call only in case of MTRRs being disabled in cpuid.

s/cpuid/CPUID/g

> Signed-off-by: Juergen Gross <[email protected]>
> ---
> V2:
> - new patch
> ---
> arch/x86/include/asm/mtrr.h | 2 ++
> arch/x86/kernel/cpu/mtrr/generic.c | 38 ++++++++++++++++++++++++++++++
> arch/x86/kernel/cpu/mtrr/mtrr.c | 9 +++++++
> 3 files changed, 49 insertions(+)
>
> diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
> index f0eeaf6e5f5f..0b8f51d683dc 100644
> --- a/arch/x86/include/asm/mtrr.h
> +++ b/arch/x86/include/asm/mtrr.h
> @@ -31,6 +31,8 @@
> */
> # ifdef CONFIG_MTRR
> void mtrr_bp_init(void);
> +void mtrr_overwrite_state(struct mtrr_var_range *var, unsigned int num_var,
> + mtrr_type *fixed, mtrr_type def_type);
> extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform);
> extern void mtrr_save_fixed_ranges(void *);
> extern void mtrr_save_state(void);
> diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
> index ee09d359e08f..788bc16888a5 100644
> --- a/arch/x86/kernel/cpu/mtrr/generic.c
> +++ b/arch/x86/kernel/cpu/mtrr/generic.c
> @@ -240,6 +240,44 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
> return mtrr_state.def_type;
> }
>
> +/**
> + * mtrr_overwrite_state - set fixed MTRR state

fixed only? You pass in variable too...

> + *
> + * Used to set MTRR state via different means (e.g. with data obtained from
> + * a hypervisor).
> + */
> +void mtrr_overwrite_state(struct mtrr_var_range *var, unsigned int num_var,
> + mtrr_type *fixed, mtrr_type def_type)
> +{
> + unsigned int i;
> +
> + if (boot_cpu_has(X86_FEATURE_MTRR))

check_for_deprecated_apis: WARNING: arch/x86/kernel/cpu/mtrr/generic.c:254: Do not use boot_cpu_has() - use cpu_feature_enabled() instead.

> + return;

So this here needs to check:

	if (!cpu_feature_enabled(X86_FEATURE_HYPERVISOR) &&
	    !(cpu_feature_enabled(X86_FEATURE_SEV_SNP) ||
	      cpu_feature_enabled(X86_FEATURE_XENPV))) {
		WARN_ON_ONCE(1);
		return;
	}

as we don't want this to be called somewhere or by something else.

The SEV_SNP flag can be used from:

https://lore.kernel.org/r/[email protected]

I'm assuming here HyperV SEV-SNP guests really do set that feature flag
(they better). We can expedite that patch ofc.

And for dom0 I *think* we use X86_FEATURE_XENPV but I leave that to you.

> +
> + if (var) {
> + if (num_var > MTRR_MAX_VAR_RANGES) {
> + pr_warn("Trying to overwrite MTRR state with %u variable entries\n",
> + num_var);

What's that check for? Sanity of callers?

> + num_var = MTRR_MAX_VAR_RANGES;
> + }
> + for (i = 0; i < num_var; i++)
> + mtrr_state.var_ranges[i] = var[i];
> + num_var_ranges = num_var;
> + }
> +
> + if (fixed) {
> + for (i = 0; i < MTRR_NUM_FIXED_RANGES; i++)

You're not doing this sanity check here, expecting that callers would
know what they're doing...

> + mtrr_state.fixed_ranges[i] = fixed[i];
> + mtrr_state.enabled |= MTRR_STATE_MTRR_FIXED_ENABLED;
> + mtrr_state.have_fixed = 1;
> + }
> +
> + mtrr_state.def_type = def_type;
> + mtrr_state.enabled |= MTRR_STATE_MTRR_ENABLED;
> +
> + mtrr_state_set = 1;
> +}

I can't say that I'm crazy about the call sites:

mtrr_overwrite_state(NULL, 0, NULL, MTRR_TYPE_WRBACK);

This looks like it wants a

mtrr_override_def_type(MTRR_TYPE_WRBACK);

instead of passing in all those nulls as params.

This:

mtrr_overwrite_state(var, reg, NULL, MTRR_TYPE_UNCACHABLE);

I guess is a bit better.

Dunno, if it is only those two callers we can say, meh, whatever, this
interface is not pretty but does the job at least. But if more users
start popping up then I guess we can do

mtrr_override_fixed()
mtrr_override_variable()
mtrr_override_def_type()

...

> /**
> * mtrr_type_lookup - look up memory type in MTRR
> *
> diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.c b/arch/x86/kernel/cpu/mtrr/mtrr.c
> index 542ca5639dfd..b73fe243c7fd 100644
> --- a/arch/x86/kernel/cpu/mtrr/mtrr.c
> +++ b/arch/x86/kernel/cpu/mtrr/mtrr.c
> @@ -668,6 +668,15 @@ void __init mtrr_bp_init(void)
> const char *why = "(not available)";
> unsigned int phys_addr;
>
> + if (mtrr_state.enabled) {

Not crazy about this either: this relies on the fragile boot ordering
where init_hypervisor_platform() runs before this so it has a chance
that mtrr_state.enabled will be already set.

Yeah, yeah, cache_bp_init() and all the MTRR BSP setup stuff happens
after it but there should at least be a comment over
init_hypervisor_platform()'s call site in setup_arch() stating that
cache_bp_init() needs to happen *after* it because <reason>.

I think we should also check

x86_hyper_type

here and not do anything if not set. As this is all HV-related muck.

Xen I guess is a bit better because that call there happens even earlier
but we need the comments to say that the ordering matters because future
reorganization could cause it to blow up and people would search
themselves crazy why in the hell it breaks...

Can Xen use x86_hyper_type() too?

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette
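
A rough illustration of the split interface floated in the review above, with the caller sanity checks it asks about. This is a user-space model under stated assumptions: the three function names come from the mail, but their signatures, the return codes, the explicit num_fixed parameter and the "model" state struct are all hypothetical. Unlike the posted patch, which clamps an oversized num_var, this sketch rejects bad input outright.

```c
/* Sketch only: models the idea in user space, not kernel code. */
#include <stddef.h>
#include <string.h>

#define MTRR_MAX_VAR_RANGES	256
#define MTRR_NUM_FIXED_RANGES	88
#define MTRR_TYPE_WRBACK	6

typedef unsigned char mtrr_type;
struct mtrr_var_range {
	unsigned int base_lo, base_hi, mask_lo, mask_hi;
};

/* Hypothetical stand-in for the kernel's mtrr_state. */
static struct {
	struct mtrr_var_range var_ranges[MTRR_MAX_VAR_RANGES];
	mtrr_type fixed_ranges[MTRR_NUM_FIXED_RANGES];
	unsigned int num_var;
	int have_fixed;
	mtrr_type def_type;
} model;

/* Reject (rather than clamp) a variable-range count that cannot fit. */
static int mtrr_override_variable(const struct mtrr_var_range *var,
				  unsigned int num_var)
{
	if (!var || num_var > MTRR_MAX_VAR_RANGES)
		return -1;
	memcpy(model.var_ranges, var, num_var * sizeof(*var));
	model.num_var = num_var;
	return 0;
}

/* The caller states the array size, so a short array is caught here. */
static int mtrr_override_fixed(const mtrr_type *fixed, size_t num_fixed)
{
	if (!fixed || num_fixed != MTRR_NUM_FIXED_RANGES)
		return -1;
	memcpy(model.fixed_ranges, fixed, num_fixed);
	model.have_fixed = 1;
	return 0;
}

/* The "only change the default type" case needs no NULL parameters. */
static void mtrr_override_def_type(mtrr_type def_type)
{
	model.def_type = def_type;
}
```

Whether three entry points are worth it over one combined call depends on how many users show up, which is the open question in the mail above.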

2023-02-13 11:46:28

by Christian Kujau

Subject: Re: [PATCH v2 5/8] x86/mtrr: revert commit 90b926e68f50

On Mon, 13 Feb 2023, Juergen Gross wrote:
> On 10.02.23 19:59, Linux regression tracking (Thorsten Leemhuis) wrote:
> > Hi, this is your Linux kernel regression tracker.
> >
> > On 09.02.23 08:22, Juergen Gross wrote:
> > > Commit 90b926e68f50 ("x86/pat: Fix pat_x_mtrr_type() for MTRR disabled
> > > case") has introduced a regression with Xen.
> > >
> > > Revert the patch.
> >
> > That regression you refer to is afaics one I'm tracking[1] that was
> > introduced this cycle. That makes me wonder: could this patch be applied
> > directly to fix the issue quickly? Or are patches 1 to 4 needed as well
> > (or the whole series?) to avoid other problems?
>
> Patches 1-4 are needed, too, as otherwise the issue claimed to be fixed
> with patch 5 would show up again.

The (last?) -rc8 version was released yesterday. Would it be possible to
include at least (only) the revert in mainline so that 6.2 will be
released with a working storage configuration under Xen?

Otherwise one would have to carry around that single revert manually until
this patch series has landed in mainline, or convince all the
distributions to do so :-\

Anyway, thanks for fixing this problem, I did not expect this to be such a
complicated issue when I reported that thing :-)

Christian.
--
BOFH excuse #52:

Smell from unhygienic janitorial staff wrecked the tape heads

2023-02-13 14:07:19

On 13.02.23 16:11, Borislav Petkov wrote:
> On Mon, Feb 13, 2023 at 04:03:07PM +0100, Borislav Petkov wrote:
>>> Wouldn't !cpu_feature_enabled(X86_FEATURE_HYPERVISOR) be enough?
>>>
>>> I'm not sure we won't need that for TDX guests, too.
>>
>> See, that's the problem. I wanna have it simple too. Lemme check with
>> dhansen.
>
> He says MTRRs are enabled in TDX guests: "X86_FEATURE_MTRR is fixed to
> 1 in TDX guests."
>
> So we will have to do the more finer-grained check I guess.

Isn't the check for !X86_FEATURE_MTRR && X86_FEATURE_HYPERVISOR enough
then?

Yes, you still could construct cases where it would go wrong, but I don't
think we should over-engineer it.

Juergen
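
To make the proposed condition concrete, here is a tiny model of "!X86_FEATURE_MTRR && X86_FEATURE_HYPERVISOR". The struct and function names are hypothetical and the two booleans merely stand in for the CPUID-derived feature bits; it is not kernel code. It shows that a guest which reports MTRRs as present, as quoted above for TDX, would not pass this check.

```c
/* Sketch only: two booleans stand in for the CPUID feature bits. */
#include <stdbool.h>

struct guest_caps {		/* hypothetical, for illustration */
	bool has_mtrr;		/* models X86_FEATURE_MTRR */
	bool has_hypervisor;	/* models X86_FEATURE_HYPERVISOR */
};

/* Allow the overwrite only for a guest without MTRRs in CPUID. */
static bool overwrite_allowed(const struct guest_caps *c)
{
	return !c->has_mtrr && c->has_hypervisor;
}
```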

2023-02-13 15:27:42

by Dave Hansen

Subject: Re: [PATCH v2 2/8] x86/mtrr: support setting MTRR state for software defined MTRRs

On 2/13/23 07:11, Borislav Petkov wrote:
> On Mon, Feb 13, 2023 at 04:03:07PM +0100, Borislav Petkov wrote:
>>> Wouldn't !cpu_feature_enabled(X86_FEATURE_HYPERVISOR) be enough?
>>>
>>> I'm not sure we won't need that for TDX guests, too.
>> See, that's the problem. I wanna have it simple too. Lemme check with
>> dhansen.
> He says MTRRs are enabled in TDX guests: "X86_FEATURE_MTRR is fixed to
> 1 in TDX guests."
>
> So we will have to do the more finer-grained check I guess.

Yes, TDX guests see MTRRs as being supported. But, the TDX module also
appears to inject a #VE for all RDMSR or WRMSR's to the MTRRs. That
makes them effectively useless.

I actually don't know what the heck TDX guests are supposed to do if
they feel like mucking with the MSRs. The architecture (CPUID) is
essentially telling them: "Sure, go ahead MTRRs are fiiiiiiine". But
the TDX module is sitting there throwing exceptions (#VE) if the guest
tries to touch MTRRs.

It sounds like there are some guest<->host ABIs on Xen to help the
guests do this. But I don't see anything in the TDX "GHCI" about it.

2023-02-13 15:36:18

by Jürgen Groß

Subject: Re: [PATCH v2 2/8] x86/mtrr: support setting MTRR state for software defined MTRRs

On 13.02.23 16:03, Borislav Petkov wrote:
> On Mon, Feb 13, 2023 at 03:07:07PM +0100, Juergen Gross wrote:
>> Fixed in the sense of static.
>
> Well, you can't use "fixed" to say "static" when former means something
> very specific already in MTRR land.
>
>> Wouldn't !cpu_feature_enabled(X86_FEATURE_HYPERVISOR) be enough?
>>
>> I'm not sure we won't need that for TDX guests, too.
>
> See, that's the problem. I wanna have it simple too. Lemme check with
> dhansen.
>
>> Yes, it is only relevant for PV dom0.
>
> Right, I was asking whether "PV dom0" == X86_FEATURE_XENPV?

No, you can have PV guests not being dom0.

>
> :)
>
>> The number of fixed MTRRs is not dynamic AFAIK.
>
> But nothing guarantees that the caller would pass an array "mtrr_type
> *fixed" of size MTRR_NUM_FIXED_RANGES, right?

Right.

In the end I wouldn't mind dropping the fixed MTRRs from the interface, as
they are currently not needed at all.

>
>> A single interface makes it easier to avoid multiple calls.
>>
>> In the end I'm fine with either way.
>
> Yeah, I know. Question is, how much of this functionality will be
> needed/used so that we can go all out on the interface design or we can
> do a single one and forget about it...

I'd say we go with what is needed right now. And having a single interface
makes all the sanity checking you are asking for easier.

>
>>> Can Xen use x86_hyper_type() too?
>>
>> It does.
>
> Then pls add a x86_hyper_type check too to make sure a potential move of
> this call is caught in the future.

What are you especially asking for?

With my current patches Xen PV dom0 will call mtrr_overwrite_state() before
x86_hyper_type is set, while a Hyper-V SEV-SNP guest will make the call after
it has been set. Both calls happen before cache_bp_init().

So I could move the mtrr_overwrite_state() call for Xen PV dom0 into its
init_platform() callback and check in mtrr_overwrite_state() x86_hyper_type
to be set, or I could reject a call of mtrr_overwrite_state() after the call
of cache_bp_init() has happened, or I could do both.

Juergen
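
The "do both" option above could be modelled like this. It is a user-space sketch under assumptions: the enum is a trimmed stand-in for the kernel's hypervisor type enum, cache_bp_init_done and mtrr_overwrite_state_checked() are hypothetical names, and the real function would also copy the MTRR data and warn on rejection.

```c
/* Sketch only: models the two ordering checks, not kernel code. */
#include <stdbool.h>

/* Trimmed stand-in for the kernel's hypervisor type enum. */
enum x86_hypervisor_type {
	X86_HYPER_NATIVE = 0,
	X86_HYPER_MS_HYPERV,
	X86_HYPER_XEN_PV,
};

static enum x86_hypervisor_type x86_hyper_type = X86_HYPER_NATIVE;
static bool cache_bp_init_done;	/* hypothetical flag */
static bool mtrr_state_set;

/* Stand-in for the boot-time cache setup; it only records that it ran. */
static void cache_bp_init(void)
{
	cache_bp_init_done = true;
}

/* Accept the overwrite only from a hypervisor guest, before cache setup. */
static bool mtrr_overwrite_state_checked(void)
{
	if (x86_hyper_type == X86_HYPER_NATIVE)
		return false;	/* caller ran before platform detection */
	if (cache_bp_init_done)
		return false;	/* too late, the state was already consumed */
	mtrr_state_set = true;
	return true;
}
```

With both checks in place, moving either call in setup_arch() in a way that breaks the ordering shows up as a rejected call instead of a silently ignored state.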

2023-02-13 15:39:07

On 13.02.23 16:40, Borislav Petkov wrote:
> On Mon, Feb 13, 2023 at 04:18:48PM +0100, Juergen Gross wrote:
>> Yes, you still could construct cases where it would go wrong, but I don't
>> think we should over-engineer it.
>
> Actually, we should allow only those for which we know they get special
> treatment for MTRRs settings and warn for all the rest.
>
> And judging by Dave's reply, I think TDX should be in that category too
> since it throws #VEs...
>

Okay, and it has MTRRs enabled (as Hyper-V SEV-SNP guests), so I shouldn't
test that, I guess (or we should disable the feature before calling the
overwrite function).

Juergen

2023-02-13 16:25:11

On 16.02.23 12:25, Borislav Petkov wrote:
> On Thu, Feb 16, 2023 at 10:32:28AM +0100, Juergen Gross wrote:
>> Is that flag _really_ meant to indicate we are running as a SEV-SNP guest?
>
> Yes.
>
>> Given that the referenced patch is part of the SEV-SNP host support series,
>> I'm inclined to suspect it won't be set for sure in HyperV SEV-SNP guests.
>
> It better be. If it is a modified guest - no matter how modified - it
> should set that flag. The vTOM thing is still being discussed.
>
>> And who is setting it for KVM SEV-SNP guests?
>
> That same patch does.

Hmm, I must be blind. I can't spot it.

I'm seeing only the feature bit #define and a call of
setup_clear_cpu_cap(X86_FEATURE_SEV_SNP) in this patch.

Or is it done by hardware or the hypervisor?

Juergen

2023-02-16 12:29:34

by Borislav Petkov

Subject: Re: [PATCH v2 2/8] x86/mtrr: support setting MTRR state for software defined MTRRs

On Thu, Feb 16, 2023 at 01:19:22PM +0100, Juergen Gross wrote:
> Hmm, I must be blind. I can't spot it.
>
> I'm seeing only the feature bit #define and a call of
> setup_clear_cpu_cap(X86_FEATURE_SEV_SNP) in this patch.
>
> Or is it done by hardware or the hypervisor?

Correction - I meant CC_ATTR_GUEST_SEV_SNP not the CPUID feature flag.

Sorry for the confusion folks.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-02-16 16:04:58

by Michael Kelley (LINUX)

Subject: RE: [PATCH v2 2/8] x86/mtrr: support setting MTRR state for software defined MTRRs

From: Borislav Petkov <[email protected]> Sent: Thursday, February 16, 2023 4:29 AM
>
> On Thu, Feb 16, 2023 at 01:19:22PM +0100, Juergen Gross wrote:
> > Hmm, I must be blind. I can't spot it.
> >
> > I'm seeing only the feature bit #define and a call of
> > setup_clear_cpu_cap(X86_FEATURE_SEV_SNP) in this patch.
> >
> > Or is it done by hardware or the hypervisor?
>
> Correction - I meant CC_ATTR_GUEST_SEV_SNP not the CPUID feature flag.
>

In current upstream code, Hyper-V vTOM VMs aren't participating in
the CC_ATTR_* scheme at all, so CC_ATTR_GUEST_SEV_SNP won't be
set. Getting Hyper-V vTOM VMs integrated into that scheme is a key
part of my big patch set[1] that we're separately trying to resolve the
last issues with.

Michael

[1] https://lore.kernel.org/linux-hyperv/[email protected]/