2022-05-10 19:36:15

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH] locking/atomic/x86: Introduce try_cmpxchg64

On Tue, May 10, 2022 at 6:55 PM Peter Zijlstra <[email protected]> wrote:
>
> On Tue, May 10, 2022 at 05:42:17PM +0200, Uros Bizjak wrote:
> > This patch adds try_cmpxchg64 to improve code around cmpxchg8b. While
> > the resulting code improvements on x86_64 are minor (a compare and a move saved),
> > the improvements on x86_32 are quite noticeable. The code improves from:
>
> What user of cmpxchg64 is this?

This is cmpxchg64 in pi_try_set_control from
arch/x86/kvm/vmx/posted_intr.c, as shown in a RFC patch [1].

There are some more opportunities for try_cmpxchg64 in KVM, namely
fast_pf_fix_direct_spte in arch/x86/kvm/mmu/mmu.c and
tdp_mmu_set_spte_atomic in arch/x86/kvm/mmu/tdp_mmu.c

[1] https://www.spinics.net/lists/kvm/msg266042.html

Uros.


2022-05-11 09:38:39

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] locking/atomic/x86: Introduce try_cmpxchg64

On Tue, May 10, 2022 at 07:07:25PM +0200, Uros Bizjak wrote:
> On Tue, May 10, 2022 at 6:55 PM Peter Zijlstra <[email protected]> wrote:
> >
> > On Tue, May 10, 2022 at 05:42:17PM +0200, Uros Bizjak wrote:
> > > This patch adds try_cmpxchg64 to improve code around cmpxchg8b. While
> > > the resulting code improvements on x86_64 are minor (a compare and a move saved),
> > > the improvements on x86_32 are quite noticeable. The code improves from:
> >
> > What user of cmpxchg64 is this?
>
> This is cmpxchg64 in pi_try_set_control from
> arch/x86/kvm/vmx/posted_intr.c, as shown in a RFC patch [1].

I can't read that code, my brain is hard wired to read pi as priority
inheritance/inversion.

Still, does 32bit actually support that stuff?

> There are some more opportunities for try_cmpxchg64 in KVM, namely
> fast_pf_fix_direct_spte in arch/x86/kvm/mmu/mmu.c and
> tdp_mmu_set_spte_atomic in arch/x86/kvm/mmu/tdp_mmu.c

tdp_mmu is definitely 64bit only and as such shouldn't need to use
cmpxchg64.


Anyway, your patch looks about right, but I find it *really* hard to
care about 32bit code these days.

2022-05-11 15:25:03

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH] locking/atomic/x86: Introduce try_cmpxchg64

On Wed, May 11, 2022 at 9:54 AM Peter Zijlstra <[email protected]> wrote:
>
> On Tue, May 10, 2022 at 07:07:25PM +0200, Uros Bizjak wrote:
> > On Tue, May 10, 2022 at 6:55 PM Peter Zijlstra <[email protected]> wrote:
> > >
> > > On Tue, May 10, 2022 at 05:42:17PM +0200, Uros Bizjak wrote:
> > > > This patch adds try_cmpxchg64 to improve code around cmpxchg8b. While
> > > > the resulting code improvements on x86_64 are minor (a compare and a move saved),
> > > > the improvements on x86_32 are quite noticeable. The code improves from:
> > >
> > > What user of cmpxchg64 is this?
> >
> > This is cmpxchg64 in pi_try_set_control from
> > arch/x86/kvm/vmx/posted_intr.c, as shown in a RFC patch [1].
>
> I can't read that code, my brain is hard wired to read pi as priority
> inheritance/inversion.
>
> Still, does 32bit actually support that stuff?

Unfortunately, it does:

kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \
vmx/evmcs.o vmx/nested.o vmx/posted_intr.o

And when existing cmpxchg64 is substituted with cmpxchg, the
compilation dies for 32bits with:

error: call to ‘__cmpxchg_wrong_size’ declared with attribute error:
Bad argument size for cmpxchg

So, the majority of the patch deals with 32bits and tries to implement
the inlined insn correctly for all cases. The 64bit part is simply a
call to arch_try_cmpxchg, and the rest is auto-generated from scripts.

>
> > There are some more opportunities for try_cmpxchg64 in KVM, namely
> > fast_pf_fix_direct_spte in arch/x86/kvm/mmu/mmu.c and
> > tdp_mmu_set_spte_atomic in arch/x86/kvm/mmu/tdp_mmu.c
>
> tdp_mmu is definitely 64bit only and as such shouldn't need to use
> cmpxchg64.

Indeed.

>
> Anyway, your patch looks about right, but I find it *really* hard to
> care about 32bit code these days.

Thanks, this is also my sentiment, but I hope the patch will enable
better code and perhaps ease similar situation I have had elsewhere.

Uros.

On Wed, May 11, 2022 at 9:54 AM Peter Zijlstra <[email protected]> wrote:
>
> On Tue, May 10, 2022 at 07:07:25PM +0200, Uros Bizjak wrote:
> > On Tue, May 10, 2022 at 6:55 PM Peter Zijlstra <[email protected]> wrote:
> > >
> > > On Tue, May 10, 2022 at 05:42:17PM +0200, Uros Bizjak wrote:
> > > > This patch adds try_cmpxchg64 to improve code around cmpxchg8b. While
> > > > the resulting code improvements on x86_64 are minor (a compare and a move saved),
> > > > the improvements on x86_32 are quite noticeable. The code improves from:
> > >
> > > What user of cmpxchg64 is this?
> >
> > This is cmpxchg64 in pi_try_set_control from
> > arch/x86/kvm/vmx/posted_intr.c, as shown in a RFC patch [1].
>
> I can't read that code, my brain is hard wired to read pi as priority
> inheritance/inversion.
>
> Still, does 32bit actually support that stuff?

Unfortunately, it does:

kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \
vmx/evmcs.o vmx/nested.o vmx/posted_intr.o

And when cmpxchg64 is substituted with cmpxchg, the compilation dies
for 32bits with:

error: call to ‘__cmpxchg_wrong_size’ declared with attribute error:
Bad argument size for cmpxchg

So, the majority of the patch deals with 32bits and tries to implement
the inlined insn correctly for all cases. The 64bit part is simply a
call to arch_try_cmpxchg, and the rest is auto-generated from scripts.

>
> > There are some more opportunities for try_cmpxchg64 in KVM, namely
> > fast_pf_fix_direct_spte in arch/x86/kvm/mmu/mmu.c and
> > tdp_mmu_set_spte_atomic in arch/x86/kvm/mmu/tdp_mmu.c
>
> tdp_mmu is definitely 64bit only and as such shouldn't need to use
> cmpxchg64.

Indeed.

>
> Anyway, your patch looks about right, but I find it *really* hard to
> care about 32bit code these days.

Thanks, this is also my sentiment, but I hope the patch will enable
better code and perhaps ease a similar situation elsewhere in the
sources.

Uros.

2022-05-12 00:13:45

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH] locking/atomic/x86: Introduce try_cmpxchg64

On Wed, May 11, 2022, Uros Bizjak wrote:
> On Wed, May 11, 2022 at 9:54 AM Peter Zijlstra <[email protected]> wrote:
> > Still, does 32bit actually support that stuff?
>
> Unfortunately, it does:
>
> kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \
> vmx/evmcs.o vmx/nested.o vmx/posted_intr.o
>
> And when existing cmpxchg64 is substituted with cmpxchg, the
> compilation dies for 32bits with:

...

> > Anyway, your patch looks about right, but I find it *really* hard to
> > care about 32bit code these days.
>
> Thanks, this is also my sentiment, but I hope the patch will enable
> better code and perhaps ease similar situation I have had elsewhere.

IMO, if we merge this it should be solely on the benefits to 64-bit code. Yes,
KVM still supports 32-bit kernels, but I'm fairly certain the only people that
run 32-bit KVM are KVM developers. 32-bit KVM has been completely broken for
multiple releases at least once, maybe twice, and no one ever complained.

32-bit KVM is mostly useful for testing the mess that is nested NPT; an L1
hypervsior can use 32-bit paging for NPT, so KVM needs to at least make sure it
doesn't blow up if such a hypervisor is encountered. But in terms of the performance
of 32-bit KVM, I doubt there is a person in the world that cares.

2022-05-12 11:21:56

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH] locking/atomic/x86: Introduce try_cmpxchg64

On Wed, May 11, 2022 at 6:04 PM Sean Christopherson <[email protected]> wrote:
>
> On Wed, May 11, 2022, Uros Bizjak wrote:
> > On Wed, May 11, 2022 at 9:54 AM Peter Zijlstra <[email protected]> wrote:
> > > Still, does 32bit actually support that stuff?
> >
> > Unfortunately, it does:
> >
> > kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \
> > vmx/evmcs.o vmx/nested.o vmx/posted_intr.o
> >
> > And when existing cmpxchg64 is substituted with cmpxchg, the
> > compilation dies for 32bits with:
>
> ...
>
> > > Anyway, your patch looks about right, but I find it *really* hard to
> > > care about 32bit code these days.
> >
> > Thanks, this is also my sentiment, but I hope the patch will enable
> > better code and perhaps ease similar situation I have had elsewhere.
>
> IMO, if we merge this it should be solely on the benefits to 64-bit code. Yes,
> KVM still supports 32-bit kernels, but I'm fairly certain the only people that
> run 32-bit KVM are KVM developers. 32-bit KVM has been completely broken for
> multiple releases at least once, maybe twice, and no one ever complained.

Yes, the idea was to improve cmpxchg64 with the implementation of
try_cmpxchg64 for 64bit targets. However, the issue with 32bit targets
stood in the way, so the effort with 32-bit implementation was mainly
to unblock progression for 64-bit targets.

Uros.

2022-05-16 19:06:23

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH] locking/atomic/x86: Introduce try_cmpxchg64

On Wed, 2022-05-11 at 21:54 +0200, Uros Bizjak wrote:
> On Wed, May 11, 2022 at 6:04 PM Sean Christopherson <[email protected]> wrote:
> > On Wed, May 11, 2022, Uros Bizjak wrote:
> > > On Wed, May 11, 2022 at 9:54 AM Peter Zijlstra <[email protected]> wrote:
> > > > Still, does 32bit actually support that stuff?
> > >
> > > Unfortunately, it does:
> > >
> > > kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \
> > > vmx/evmcs.o vmx/nested.o vmx/posted_intr.o
> > >
> > > And when existing cmpxchg64 is substituted with cmpxchg, the
> > > compilation dies for 32bits with:
> >
> > ...
> >
> > > > Anyway, your patch looks about right, but I find it *really* hard to
> > > > care about 32bit code these days.
> > >
> > > Thanks, this is also my sentiment, but I hope the patch will enable
> > > better code and perhaps ease similar situation I have had elsewhere.
> >
> > IMO, if we merge this it should be solely on the benefits to 64-bit code. Yes,
> > KVM still supports 32-bit kernels, but I'm fairly certain the only people that
> > run 32-bit KVM are KVM developers. 32-bit KVM has been completely broken for
> > multiple releases at least once, maybe twice, and no one ever complained.
>
> Yes, the idea was to improve cmpxchg64 with the implementation of
> try_cmpxchg64 for 64bit targets. However, the issue with 32bit targets
> stood in the way, so the effort with 32-bit implementation was mainly
> to unblock progression for 64-bit targets.

Would that allow tdp mmu to work on 32 bit?

Best regards,
Maxim Levitsky

>
> Uros.
>



2022-05-17 02:17:44

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH] locking/atomic/x86: Introduce try_cmpxchg64

On Mon, May 16, 2022, Maxim Levitsky wrote:
> On Wed, 2022-05-11 at 21:54 +0200, Uros Bizjak wrote:
> > On Wed, May 11, 2022 at 6:04 PM Sean Christopherson <[email protected]> wrote:
> > > On Wed, May 11, 2022, Uros Bizjak wrote:
> > > > On Wed, May 11, 2022 at 9:54 AM Peter Zijlstra <[email protected]> wrote:
> > > > > Still, does 32bit actually support that stuff?
> > > >
> > > > Unfortunately, it does:
> > > >
> > > > kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \
> > > > vmx/evmcs.o vmx/nested.o vmx/posted_intr.o
> > > >
> > > > And when existing cmpxchg64 is substituted with cmpxchg, the
> > > > compilation dies for 32bits with:
> > >
> > > ...
> > >
> > > > > Anyway, your patch looks about right, but I find it *really* hard to
> > > > > care about 32bit code these days.
> > > >
> > > > Thanks, this is also my sentiment, but I hope the patch will enable
> > > > better code and perhaps ease similar situation I have had elsewhere.
> > >
> > > IMO, if we merge this it should be solely on the benefits to 64-bit code. Yes,
> > > KVM still supports 32-bit kernels, but I'm fairly certain the only people that
> > > run 32-bit KVM are KVM developers. 32-bit KVM has been completely broken for
> > > multiple releases at least once, maybe twice, and no one ever complained.
> >
> > Yes, the idea was to improve cmpxchg64 with the implementation of
> > try_cmpxchg64 for 64bit targets. However, the issue with 32bit targets
> > stood in the way, so the effort with 32-bit implementation was mainly
> > to unblock progression for 64-bit targets.
>
> Would that allow tdp mmu to work on 32 bit?

From a purely technical perspective, there's nothing that prevents enabling the
TDP MMU on 32-bit kernels. The TDP MMU is 64-bit only to simplify the implementation
and to reduce the maintenance and validation costs.