LinuxLists.cc - Re: [PATCH] x86/tsx: fix KVM guest live migration for tsx=on

2022-04-12 06:17:27

Subject: Re: [PATCH] x86/tsx: fix KVM guest live migration for tsx=on

> On Apr 11, 2022, at 3:26 PM, Dave Hansen <[email protected]> wrote:
>
> On 4/11/22 11:01, Jon Kohler wrote:
>> static enum tsx_ctrl_states x86_get_tsx_auto_mode(void)
>> {
>> + /*
>> + * Hardware will always abort a TSX transaction if both CPUID bits
>> + * RTM_ALWAYS_ABORT and TSX_FORCE_ABORT are set. In this case, it is
>> + * better not to enumerate CPUID.RTM and CPUID.HLE bits. Clear them
>> + * here.
>> + */
>> + if (boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT) &&
>> + boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)) {
>> + tsx_clear_cpuid();
>> + setup_clear_cpu_cap(X86_FEATURE_RTM);
>> + setup_clear_cpu_cap(X86_FEATURE_HLE);
>> + return TSX_CTRL_RTM_ALWAYS_ABORT;
>> + }
>
> I don't really like hiding the setup_clear_cpu_cap() like this. Right
> now, all of the setup_clear_cpu_cap()'s are in a single function and
> they are pretty easy to figure out.
>
> This seems like logic that deserves to be appended down to the last if()
> block of code in tsx_init() instead of squirreled away in a "get mode"
> function. Does this work?

Thanks for the review, Dave. Was trying to make the change simple
with just a cut-n-paste of existing code from one place to the other,
but I see what you’re saying. Yes, I can rework the logic as you
suggested, I’ll send out a v2 patch.

Also, while I’ve got you, I’d also like to send out a patch to simply
force abort all transactions even when tsx=on, and just be done with
TSX. Now that we’ve had the patch that introduced this functionality
I’m patching for roughly a year, combined with the microcode going
out, it seems like TSX’s numbered days have come to an end.

That could greatly simplify the kernel’s handling of TAA on systems
that have ARCH_CAP_TSX_CTRL_MSR.

Thoughts?

> if (tsx_ctrl_state == TSX_CTRL_DISABLE) {
> ...
> } else if (tsx_ctrl_state == TSX_CTRL_ENABLE) {
> ...
> } else if (tsx_ctrl_state == TSX_CTRL_RTM_ALWAYS_ABORT) {
> tsx_clear_cpuid();
>
> setup_clear_cpu_cap(X86_FEATURE_RTM);
> setup_clear_cpu_cap(X86_FEATURE_HLE);
> }
>

2022-04-12 22:47:44

by Dave Hansen

Subject: Re: [PATCH] x86/tsx: fix KVM guest live migration for tsx=on

On 4/11/22 12:35, Jon Kohler wrote:
> Also, while I’ve got you, I’d also like to send out a patch to simply
> force abort all transactions even when tsx=on, and just be done with
> TSX. Now that we’ve had the patch that introduced this functionality
> I’m patching for roughly a year, combined with the microcode going
> out, it seems like TSX’s numbered days have come to an end.

Could you elaborate a little more here? Why would we ever want to force
abort transactions that don't need to be aborted for some reason?

2022-04-12 23:09:57

by Jon Kohler

Subject: Re: [PATCH] x86/tsx: fix KVM guest live migration for tsx=on

> On Apr 11, 2022, at 7:45 PM, Dave Hansen <[email protected]> wrote:
>
> On 4/11/22 12:35, Jon Kohler wrote:
>> Also, while I’ve got you, I’d also like to send out a patch to simply
>> force abort all transactions even when tsx=on, and just be done with
>> TSX. Now that we’ve had the patch that introduced this functionality
>> I’m patching for roughly a year, combined with the microcode going
>> out, it seems like TSX’s numbered days have come to an end.
>
> Could you elaborate a little more here? Why would we ever want to force
> abort transactions that don't need to be aborted for some reason?

Sure, I'm talking specifically about users of tsx=on (or
CONFIG_X86_INTEL_TSX_MODE_ON) on X86_BUG_TAA CPU SKUs. In this situation,
TSX features are enabled, as are TAA mitigations. Using our own use case
as an example, we only do this because of legacy live migration reasons.

This is fine on Skylake (because we're signed up for MDS mitigation anyhow)
and fine on Ice Lake because TAA_NO=1; however this is wicked painful on
Cascade Lake, because MDS_NO=1 and TAA_NO=0, so we're still signed up for
TAA mitigation by default. On CLX, this hits us on host syscalls as well as
vmexits with the mds clear on every one :(
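
To make the per-CPU comparison above concrete, here is a minimal sketch of the reasoning as a user-space model. It is not kernel code: the struct, its field names, and the helper functions are hypothetical, and the rule it encodes (TAA clearing only matters when the CPU lacks TAA_NO and TSX is left usable) is a simplification of what the kernel actually decides.

```c
#include <stdbool.h>

/* Simplified model of one CPU; all names here are illustrative assumptions. */
struct cpu_model {
    bool mds_no;      /* IA32_ARCH_CAPABILITIES reports MDS_NO */
    bool taa_no;      /* IA32_ARCH_CAPABILITIES reports TAA_NO */
    bool tsx_enabled; /* RTM left usable, e.g. booted with tsx=on */
};

/* MDS buffer clearing is needed unless the CPU reports MDS_NO. */
bool needs_mds_clear(const struct cpu_model *c)
{
    return !c->mds_no;
}

/* TAA buffer clearing is needed when the CPU is affected and TSX is usable. */
bool needs_taa_clear(const struct cpu_model *c)
{
    return !c->taa_no && c->tsx_enabled;
}

/* True when buffer clearing is being paid for only because TSX stays on. */
bool tsx_adds_overhead(const struct cpu_model *c)
{
    return needs_taa_clear(c) && !needs_mds_clear(c);
}
```

Under this model only the Cascade Lake combination (MDS_NO=1, TAA_NO=0, TSX enabled) reports extra overhead. The Skylake-like and Ice Lake-like combinations do not, matching the description above.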

So tsx=on is this oddball for us, because if we switch to auto, we'll break
live migration for some of our customers (but TAA overhead is gone), but
if we leave tsx=on, we keep the feature enabled (but no one likely uses it)
and still have to pay the TAA tax even if a customer doesn't use it.

So my theory here is to extend the logical effort of the microcode driven
automatic disablement as well as the tsx=auto automatic disablement and
have tsx=on force abort all transactions on X86_BUG_TAA SKUs, but leave
the CPU features enumerated to maintain live migration.

This would still leave TSX totally good on Ice Lake / non-buggy systems.

If it would help, I'm working up an RFC patch, and we could discuss there?

In the meantime, I did send out a v2 patch for this series addressing your
comments.

Thanks again,
Jon

2022-04-12 23:31:45

by Dave Hansen

Subject: Re: [PATCH] x86/tsx: fix KVM guest live migration for tsx=on

On 4/12/22 06:36, Jon Kohler wrote:
> So my theory here is to extend the logical effort of the microcode driven
> automatic disablement as well as the tsx=auto automatic disablement and
> have tsx=on force abort all transactions on X86_BUG_TAA SKUs, but leave
> the CPU features enumerated to maintain live migration.
>
> This would still leave TSX totally good on Ice Lake / non-buggy systems.
>
> If it would help, I'm working up an RFC patch, and we could discuss there?

Sure. But, it sounds like you really want a new tsx=something rather
than to muck with tsx=on behavior. Surely someone else will come along
and complain that we broke their TSX setup if we change its behavior.

Maybe you should just pay the one-time cost and move your whole fleet
over to tsx=off if you truly believe nobody is using it.

2022-04-12 23:45:24

by Pawan Gupta

Subject: Re: [PATCH] x86/tsx: fix KVM guest live migration for tsx=on

On Tue, Apr 12, 2022 at 01:36:20PM +0000, Jon Kohler wrote:
>
>
>> On Apr 11, 2022, at 7:45 PM, Dave Hansen <[email protected]> wrote:
>>
>> On 4/11/22 12:35, Jon Kohler wrote:
>>> Also, while I’ve got you, I’d also like to send out a patch to simply
>>> force abort all transactions even when tsx=on, and just be done with
>>> TSX. Now that we’ve had the patch that introduced this functionality
>>> I’m patching for roughly a year, combined with the microcode going
>>> out, it seems like TSX’s numbered days have come to an end.
>>
>> Could you elaborate a little more here? Why would we ever want to force
>> abort transactions that don't need to be aborted for some reason?
>
>Sure, I'm talking specifically about when users of tsx=on (or
>CONFIG_X86_INTEL_TSX_MODE_ON) on X86_BUG_TAA CPU SKUs. In this situation,
>TSX features are enabled, as are TAA mitigations. Using our own use case
>as an example, we only do this because of legacy live migration reasons.
>
>This is fine on Skylake (because we're signed up for MDS mitigation anyhow)
>and fine on Ice Lake because TAA_NO=1; however this is wicked painful on
>Cascade Lake, because MDS_NO=1 and TAA_NO=0, so we're still signed up for
>TAA mitigation by default. On CLX, this hits us on host syscalls as well as
>vmexits with the mds clear on every one :(
>
>So tsx=on is this oddball for us, because if we switch to auto, we'll break
>live migration for some of our customers (but TAA overhead is gone), but
>if we leave tsx=on, we keep the feature enabled (but no one likely uses it)
>and still have to pay the TAA tax even if a customer doesn't use it.
>
>So my theory here is to extend the logical effort of the microcode driven
>automatic disablement as well as the tsx=auto automatic disablement and
>have tsx=on force abort all transactions on X86_BUG_TAA SKUs, but leave
>the CPU features enumerated to maintain live migration.

This won't help on CLX as server parts did not get the microcode driven
automatic disablement. On CLX CPUID.RTM_ALWAYS_ABORT will not be set.

What could work on CLX is TSX_CTRL_RTM_DISABLE=1 and
TSX_CTRL_CPUID_CLEAR=0. This can be done for tsx=auto or with a new mode
tsx=fake|compat. IMO, adding a new mode would be better, otherwise
tsx=auto behavior will differ depending on the kernel version.
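
As a rough illustration of the register setting described above, the sketch below computes an IA32_TSX_CTRL value with RTM_DISABLE set and CPUID_CLEAR cleared. The helper name is hypothetical and this is plain user-space arithmetic, not the kernel's MSR code. The bit positions are the ones defined for this MSR (bit 0 and bit 1), and the function only computes a value without writing to hardware.

```c
#include <stdint.h>

/* Bit positions defined for IA32_TSX_CTRL (MSR 0x122). */
#define TSX_CTRL_RTM_DISABLE (1ULL << 0) /* force RTM transactions to abort */
#define TSX_CTRL_CPUID_CLEAR (1ULL << 1) /* hide RTM and HLE from CPUID */

/*
 * Hypothetical helper: given the current MSR value, return the value for a
 * "compat" style mode, where transactions always abort but CPUID still
 * enumerates RTM and HLE. All other bits are preserved.
 */
uint64_t tsx_ctrl_compat_value(uint64_t val)
{
    val |= TSX_CTRL_RTM_DISABLE;   /* RTM_DISABLE = 1 */
    val &= ~TSX_CTRL_CPUID_CLEAR;  /* CPUID_CLEAR = 0 */
    return val;
}
```

Keeping CPUID_CLEAR at 0 is what would let guests continue to see the RTM and HLE bits, which is the property live migration depends on in this discussion.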

Provided that software using TSX is following below guidance [*]:

When Intel TSX is disabled at runtime using TSX_CTRL, but the CPUID
enumeration of Intel TSX is not cleared, existing software using RTM may
see aborts for every transaction. The abort will always return a 0
status code in EAX after XBEGIN. When the software does a number of
transaction retries, it should never retry for a 0 status value, but go
to the nontransactional fall back path immediately.

Thanks,
Pawan

[*] TAA document: section -> Implications on Intel TSX software
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/intel-tsx-asynchronous-abort.html
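
The retry guidance quoted above can be sketched as a small policy function. This is an assumption-laden illustration, not code from Intel's document or from any library: the function name and the attempt limit are hypothetical, and the only hardware detail used is that the "retry may succeed" hint is bit 1 of the abort status (the value of _XABORT_RETRY in <immintrin.h>).

```c
#include <stdbool.h>

/* Same value as _XABORT_RETRY in <immintrin.h>: bit 1 of the abort status. */
#define ABORT_STATUS_RETRY (1u << 1)

/*
 * Hypothetical policy helper: decide whether to retry a transaction after
 * XBEGIN returned the abort code `status`. `attempts` is the number of
 * attempts already made, `max_attempts` the caller's chosen limit.
 */
bool rtm_should_retry(unsigned int status, int attempts, int max_attempts)
{
    if (status == 0)
        return false; /* zero status: never retry, take the fallback path */
    if (!(status & ABORT_STATUS_RETRY))
        return false; /* hardware gives no hint that a retry could succeed */
    return attempts < max_attempts;
}
```

A caller would wrap its XBEGIN loop around this check and take the non-transactional (for example, lock-based) path as soon as it returns false. With TSX disabled through TSX_CTRL but still enumerated, that happens on the first attempt.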

2022-04-13 17:11:37

by Jon Kohler

Subject: Re: [PATCH] x86/tsx: fix KVM guest live migration for tsx=on

> On Apr 12, 2022, at 4:40 PM, Pawan Gupta <[email protected]> wrote:
>
> On Tue, Apr 12, 2022 at 01:36:20PM +0000, Jon Kohler wrote:
>>
>>
>>> On Apr 11, 2022, at 7:45 PM, Dave Hansen <[email protected]> wrote:
>>>
>>> On 4/11/22 12:35, Jon Kohler wrote:
>>>> Also, while I’ve got you, I’d also like to send out a patch to simply
>>>> force abort all transactions even when tsx=on, and just be done with
>>>> TSX. Now that we’ve had the patch that introduced this functionality
>>>> I’m patching for roughly a year, combined with the microcode going
>>>> out, it seems like TSX’s numbered days have come to an end.
>>>
>>> Could you elaborate a little more here? Why would we ever want to force
>>> abort transactions that don't need to be aborted for some reason?
>>
>> Sure, I'm talking specifically about when users of tsx=on (or
>> CONFIG_X86_INTEL_TSX_MODE_ON) on X86_BUG_TAA CPU SKUs. In this situation,
>> TSX features are enabled, as are TAA mitigations. Using our own use case
>> as an example, we only do this because of legacy live migration reasons.
>>
>> This is fine on Skylake (because we're signed up for MDS mitigation anyhow)
>> and fine on Ice Lake because TAA_NO=1; however this is wicked painful on
>> Cascade Lake, because MDS_NO=1 and TAA_NO=0, so we're still signed up for
>> TAA mitigation by default. On CLX, this hits us on host syscalls as well as
>> vmexits with the mds clear on every one :(
>>
>> So tsx=on is this oddball for us, because if we switch to auto, we'll break
>> live migration for some of our customers (but TAA overhead is gone), but
>> if we leave tsx=on, we keep the feature enabled (but no one likely uses it)
>> and still have to pay the TAA tax even if a customer doesn't use it.
>>
>> So my theory here is to extend the logical effort of the microcode driven
>> automatic disablement as well as the tsx=auto automatic disablement and
>> have tsx=on force abort all transactions on X86_BUG_TAA SKUs, but leave
>> the CPU features enumerated to maintain live migration.
>
> This won't help on CLX as server parts did not get the microcode driven
> automatic disablement. On CLX CPUID.RTM_ALWAYS_ABORT will not be set.
>
> What could work on CLX is TSX_CTRL_RTM_DISABLE=1 and
> TSX_CTRL_CPUID_CLEAR=0. This can be done for tsx=auto or with a new mode
> tsx=fake|compat. IMO, adding a new mode would be better, otherwise
> tsx=auto behavior will differ depending on the kernel version.

Thanks for the guidance, Pawan, I appreciate it. This is exactly the
approach my other patch is taking. Need to do a bit more review and
testing, and I'll get the RFC out.

>
> Provided that software using TSX is following below guidance [*]:
>
> When Intel TSX is disabled at runtime using TSX_CTRL, but the CPUID
> enumeration of Intel TSX is not cleared, existing software using RTM may
> see aborts for every transaction. The abort will always return a 0
> status code in EAX after XBEGIN. When the software does a number of
> transaction retries, it should never retry for a 0 status value, but go
> to the nontransactional fall back path immediately.
>
> Thanks,
> Pawan
>
> [*] TAA document: section -> Implications on Intel TSX software
> https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/intel-tsx-asynchronous-abort.html