2023-11-15 17:19:21

by Michael Kelley

Subject: RE: [PATCH] x86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown()

From: Uros Bizjak <[email protected]> Sent: Tuesday, November 14, 2023 8:59 AM
>
> Use atomic_try_cmpxchg() instead of atomic_cmpxchg(*ptr, old, new) == old
> in hv_nmi_unknown(). On x86 the CMPXCHG instruction returns success in
> the ZF flag, so this change saves a compare after CMPXCHG. The generated
> asm code improves from:
>
> 3e: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx
> 45: b8 ff ff ff ff mov $0xffffffff,%eax
> 4a: f0 0f b1 15 00 00 00 lock cmpxchg %edx,0x0(%rip)
> 51: 00
> 52: 83 f8 ff cmp $0xffffffff,%eax
> 55: 0f 95 c0 setne %al
>
> to:
>
> 3e: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx
> 45: b8 ff ff ff ff mov $0xffffffff,%eax
> 4a: f0 0f b1 15 00 00 00 lock cmpxchg %edx,0x0(%rip)
> 51: 00
> 52: 0f 95 c0 setne %al
>
> No functional change intended.
>
> Cc: "K. Y. Srinivasan" <[email protected]>
> Cc: Haiyang Zhang <[email protected]>
> Cc: Wei Liu <[email protected]>
> Cc: Dexuan Cui <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: "H. Peter Anvin" <[email protected]>
> Signed-off-by: Uros Bizjak <[email protected]>
> ---
> arch/x86/kernel/cpu/mshyperv.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index e6bba12c759c..01fa06dd06b6 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -262,11 +262,14 @@ static uint32_t __init ms_hyperv_platform(void)
>  static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
>  {
>  	static atomic_t nmi_cpu = ATOMIC_INIT(-1);
> +	unsigned int old_cpu, this_cpu;
>
>  	if (!unknown_nmi_panic)
>  		return NMI_DONE;
>
> -	if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
> +	old_cpu = -1;
> +	this_cpu = raw_smp_processor_id();
> +	if (!atomic_try_cmpxchg(&nmi_cpu, &old_cpu, this_cpu))
>  		return NMI_HANDLED;
>
>  	return NMI_DONE;
> --
> 2.41.0

The change looks correct to me. But is there any motivation other
than saving 3 bytes of generated code? This is not a performance
sensitive path. And the change adds 3 lines of source code. So
I wonder if the change is worth the churn.

In any case,

Reviewed-by: Michael Kelley <[email protected]>


2023-11-15 20:59:46

by Uros Bizjak

Subject: Re: [PATCH] x86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown()

On Wed, Nov 15, 2023 at 6:19 PM Michael Kelley <[email protected]> wrote:
>
> From: Uros Bizjak <[email protected]> Sent: Tuesday, November 14, 2023 8:59 AM
> >
> > Use atomic_try_cmpxchg() instead of atomic_cmpxchg(*ptr, old, new) == old
> > in hv_nmi_unknown(). On x86 the CMPXCHG instruction returns success in
> > the ZF flag, so this change saves a compare after CMPXCHG. The generated
> > asm code improves from:
> >
> > 3e: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx
> > 45: b8 ff ff ff ff mov $0xffffffff,%eax
> > 4a: f0 0f b1 15 00 00 00 lock cmpxchg %edx,0x0(%rip)
> > 51: 00
> > 52: 83 f8 ff cmp $0xffffffff,%eax
> > 55: 0f 95 c0 setne %al
> >
> > to:
> >
> > 3e: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx
> > 45: b8 ff ff ff ff mov $0xffffffff,%eax
> > 4a: f0 0f b1 15 00 00 00 lock cmpxchg %edx,0x0(%rip)
> > 51: 00
> > 52: 0f 95 c0 setne %al
> >
> > No functional change intended.
> >
> > Cc: "K. Y. Srinivasan" <[email protected]>
> > Cc: Haiyang Zhang <[email protected]>
> > Cc: Wei Liu <[email protected]>
> > Cc: Dexuan Cui <[email protected]>
> > Cc: Thomas Gleixner <[email protected]>
> > Cc: Ingo Molnar <[email protected]>
> > Cc: Borislav Petkov <[email protected]>
> > Cc: Dave Hansen <[email protected]>
> > Cc: "H. Peter Anvin" <[email protected]>
> > Signed-off-by: Uros Bizjak <[email protected]>
> > ---
> > arch/x86/kernel/cpu/mshyperv.c | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> > index e6bba12c759c..01fa06dd06b6 100644
> > --- a/arch/x86/kernel/cpu/mshyperv.c
> > +++ b/arch/x86/kernel/cpu/mshyperv.c
> > @@ -262,11 +262,14 @@ static uint32_t __init ms_hyperv_platform(void)
> >  static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
> >  {
> >  	static atomic_t nmi_cpu = ATOMIC_INIT(-1);
> > +	unsigned int old_cpu, this_cpu;
> >
> >  	if (!unknown_nmi_panic)
> >  		return NMI_DONE;
> >
> > -	if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
> > +	old_cpu = -1;
> > +	this_cpu = raw_smp_processor_id();
> > +	if (!atomic_try_cmpxchg(&nmi_cpu, &old_cpu, this_cpu))
> >  		return NMI_HANDLED;
> >
> >  	return NMI_DONE;
> > --
> > 2.41.0
>
> The change looks correct to me. But is there any motivation other
> than saving 3 bytes of generated code? This is not a performance
> sensitive path. And the change adds 3 lines of source code. So
> I wonder if the change is worth the churn.

Yes, I was trying to make the function easier to understand and
similar to nmi_panic() from kernel/panic.c. I also had the idea of
using a CPU_INVALID #define instead of -1, but IMO the above works as
well.
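
For readers unfamiliar with the two forms, here is a rough userspace sketch of the difference, written with C11 <stdatomic.h> rather than the kernel's atomic_t API. The helper names (sketch_cmpxchg, sketch_try_cmpxchg, claim_old, claim_new) and the CPU_INVALID value are made up for illustration and are not taken from the kernel tree. The first helper returns the previous value, so the caller has to compare it; the second returns success directly and writes the observed value back through the pointer on failure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define CPU_INVALID (-1)	/* hypothetical name for the "no owner yet" value */

/* Models a cmpxchg that returns the value seen before the operation. */
static int sketch_cmpxchg(atomic_int *v, int old, int new_val)
{
	/* On failure the C11 call stores the current value into old;
	 * on success old already equals the previous value. */
	atomic_compare_exchange_strong(v, &old, new_val);
	return old;
}

/* Models a try_cmpxchg: returns true on success, updates *old on failure. */
static bool sketch_try_cmpxchg(atomic_int *v, int *old, int new_val)
{
	return atomic_compare_exchange_strong(v, old, new_val);
}

/* Old pattern: exchange, then compare the returned value with the expected one. */
static bool claim_old(atomic_int *owner, int this_cpu)
{
	return sketch_cmpxchg(owner, CPU_INVALID, this_cpu) == CPU_INVALID;
}

/* New pattern: the success flag is the return value, no extra compare. */
static bool claim_new(atomic_int *owner, int this_cpu)
{
	int old_cpu = CPU_INVALID;

	return sketch_try_cmpxchg(owner, &old_cpu, this_cpu);
}

/* Both patterns agree: the first claimant wins, later ones lose. */
static void check_equivalence(void)
{
	atomic_int a = CPU_INVALID, b = CPU_INVALID;

	assert(claim_old(&a, 3) == claim_new(&b, 3));
	assert(claim_old(&a, 4) == claim_new(&b, 4));
	assert(atomic_load(&a) == atomic_load(&b));
}
```

Both claim functions report true only for the first caller, which is why the patch is described as having no functional change. Whether the compiler actually drops the extra compare depends on the target and toolchain; the asm in the patch description is for x86.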

> In any case,
>
> Reviewed-by: Michael Kelley <[email protected]>

Thanks,
Uros.

2023-11-22 03:52:26

by Wei Liu

Subject: Re: [PATCH] x86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown()

On Wed, Nov 15, 2023 at 09:58:29PM +0100, Uros Bizjak wrote:
> On Wed, Nov 15, 2023 at 6:19 PM Michael Kelley <[email protected]> wrote:
> >
> > From: Uros Bizjak <[email protected]> Sent: Tuesday, November 14, 2023 8:59 AM
> > >
> > > Use atomic_try_cmpxchg() instead of atomic_cmpxchg(*ptr, old, new) == old
> > > in hv_nmi_unknown(). On x86 the CMPXCHG instruction returns success in
> > > the ZF flag, so this change saves a compare after CMPXCHG. The generated
> > > asm code improves from:
> > >
> > > 3e: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx
> > > 45: b8 ff ff ff ff mov $0xffffffff,%eax
> > > 4a: f0 0f b1 15 00 00 00 lock cmpxchg %edx,0x0(%rip)
> > > 51: 00
> > > 52: 83 f8 ff cmp $0xffffffff,%eax
> > > 55: 0f 95 c0 setne %al
> > >
> > > to:
> > >
> > > 3e: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx
> > > 45: b8 ff ff ff ff mov $0xffffffff,%eax
> > > 4a: f0 0f b1 15 00 00 00 lock cmpxchg %edx,0x0(%rip)
> > > 51: 00
> > > 52: 0f 95 c0 setne %al
> > >
> > > No functional change intended.
> > >
> > > Cc: "K. Y. Srinivasan" <[email protected]>
> > > Cc: Haiyang Zhang <[email protected]>
> > > Cc: Wei Liu <[email protected]>
> > > Cc: Dexuan Cui <[email protected]>
> > > Cc: Thomas Gleixner <[email protected]>
> > > Cc: Ingo Molnar <[email protected]>
> > > Cc: Borislav Petkov <[email protected]>
> > > Cc: Dave Hansen <[email protected]>
> > > Cc: "H. Peter Anvin" <[email protected]>
> > > Signed-off-by: Uros Bizjak <[email protected]>
> > > ---
> > > arch/x86/kernel/cpu/mshyperv.c | 5 ++++-
> > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> > > index e6bba12c759c..01fa06dd06b6 100644
> > > --- a/arch/x86/kernel/cpu/mshyperv.c
> > > +++ b/arch/x86/kernel/cpu/mshyperv.c
> > > @@ -262,11 +262,14 @@ static uint32_t __init ms_hyperv_platform(void)
> > >  static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
> > >  {
> > >  	static atomic_t nmi_cpu = ATOMIC_INIT(-1);
> > > +	unsigned int old_cpu, this_cpu;
> > >
> > >  	if (!unknown_nmi_panic)
> > >  		return NMI_DONE;
> > >
> > > -	if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
> > > +	old_cpu = -1;
> > > +	this_cpu = raw_smp_processor_id();
> > > +	if (!atomic_try_cmpxchg(&nmi_cpu, &old_cpu, this_cpu))
> > >  		return NMI_HANDLED;
> > >
> > >  	return NMI_DONE;
> > > --
> > > 2.41.0
> >
> > The change looks correct to me. But is there any motivation other
> > than saving 3 bytes of generated code? This is not a performance
> > sensitive path. And the change adds 3 lines of source code. So
> > I wonder if the change is worth the churn.
>
> Yes, I was trying to make the function easier to understand and
> similar to nmi_panic() from kernel/panic.c. I also had the idea of
> using a CPU_INVALID #define instead of -1, but IMO the above works as
> well.
>
> > In any case,
> >
> > Reviewed-by: Michael Kelley <[email protected]>

Applied to hyperv-fixes.

Uros, just so you know, DKIM verification failed when I used b4 to apply
this patch. You may want to check your email setup.

For such a simple patch I'm not worried about spoofing authorship, and I
also checked the same email address had sent similar patches before.

Thanks,
Wei.

>
> Thanks,
> Uros.

2023-11-22 12:33:04

by Uros Bizjak

Subject: Re: [PATCH] x86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown()

On Wed, Nov 22, 2023 at 4:52 AM Wei Liu <[email protected]> wrote:
>
> On Wed, Nov 15, 2023 at 09:58:29PM +0100, Uros Bizjak wrote:
> > On Wed, Nov 15, 2023 at 6:19 PM Michael Kelley <[email protected]> wrote:
> > >
> > > From: Uros Bizjak <[email protected]> Sent: Tuesday, November 14, 2023 8:59 AM
> > > >
> > > > Use atomic_try_cmpxchg() instead of atomic_cmpxchg(*ptr, old, new) == old
> > > > in hv_nmi_unknown(). On x86 the CMPXCHG instruction returns success in
> > > > the ZF flag, so this change saves a compare after CMPXCHG. The generated
> > > > asm code improves from:
> > > >
> > > > 3e: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx
> > > > 45: b8 ff ff ff ff mov $0xffffffff,%eax
> > > > 4a: f0 0f b1 15 00 00 00 lock cmpxchg %edx,0x0(%rip)
> > > > 51: 00
> > > > 52: 83 f8 ff cmp $0xffffffff,%eax
> > > > 55: 0f 95 c0 setne %al
> > > >
> > > > to:
> > > >
> > > > 3e: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx
> > > > 45: b8 ff ff ff ff mov $0xffffffff,%eax
> > > > 4a: f0 0f b1 15 00 00 00 lock cmpxchg %edx,0x0(%rip)
> > > > 51: 00
> > > > 52: 0f 95 c0 setne %al
> > > >
> > > > No functional change intended.
> > > >
> > > > Cc: "K. Y. Srinivasan" <[email protected]>
> > > > Cc: Haiyang Zhang <[email protected]>
> > > > Cc: Wei Liu <[email protected]>
> > > > Cc: Dexuan Cui <[email protected]>
> > > > Cc: Thomas Gleixner <[email protected]>
> > > > Cc: Ingo Molnar <[email protected]>
> > > > Cc: Borislav Petkov <[email protected]>
> > > > Cc: Dave Hansen <[email protected]>
> > > > Cc: "H. Peter Anvin" <[email protected]>
> > > > Signed-off-by: Uros Bizjak <[email protected]>
> > > > ---
> > > > arch/x86/kernel/cpu/mshyperv.c | 5 ++++-
> > > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> > > > index e6bba12c759c..01fa06dd06b6 100644
> > > > --- a/arch/x86/kernel/cpu/mshyperv.c
> > > > +++ b/arch/x86/kernel/cpu/mshyperv.c
> > > > @@ -262,11 +262,14 @@ static uint32_t __init ms_hyperv_platform(void)
> > > >  static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
> > > >  {
> > > >  	static atomic_t nmi_cpu = ATOMIC_INIT(-1);
> > > > +	unsigned int old_cpu, this_cpu;
> > > >
> > > >  	if (!unknown_nmi_panic)
> > > >  		return NMI_DONE;
> > > >
> > > > -	if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
> > > > +	old_cpu = -1;
> > > > +	this_cpu = raw_smp_processor_id();
> > > > +	if (!atomic_try_cmpxchg(&nmi_cpu, &old_cpu, this_cpu))
> > > >  		return NMI_HANDLED;
> > > >
> > > >  	return NMI_DONE;
> > > > --
> > > > 2.41.0
> > >
> > > The change looks correct to me. But is there any motivation other
> > > than saving 3 bytes of generated code? This is not a performance
> > > sensitive path. And the change adds 3 lines of source code. So
> > > I wonder if the change is worth the churn.
> >
> > Yes, I was trying to make the function easier to understand and
> > similar to nmi_panic() from kernel/panic.c. I also had the idea of
> > using a CPU_INVALID #define instead of -1, but IMO the above works as
> > well.
> >
> > > In any case,
> > >
> > > Reviewed-by: Michael Kelley <[email protected]>
>
> Applied to hyperv-fixes.
>
> Uros, just so you know, DKIM verification failed when I used b4 to apply
> this patch. You may want to check your email setup.

Strange, because I didn't touch the mailer and git config for
months... and recently I have sent many patches this way without
problems.

Thanks,
Uros.

2023-11-22 12:38:52

by Uros Bizjak

Subject: Re: [PATCH] x86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown()

On Wed, Nov 22, 2023 at 1:31 PM Uros Bizjak <[email protected]> wrote:
>
> On Wed, Nov 22, 2023 at 4:52 AM Wei Liu <[email protected]> wrote:
> >
> > On Wed, Nov 15, 2023 at 09:58:29PM +0100, Uros Bizjak wrote:
> > > On Wed, Nov 15, 2023 at 6:19 PM Michael Kelley <[email protected]> wrote:
> > > >
> > > > From: Uros Bizjak <[email protected]> Sent: Tuesday, November 14, 2023 8:59 AM
> > > > >
> > > > > Use atomic_try_cmpxchg() instead of atomic_cmpxchg(*ptr, old, new) == old
> > > > > in hv_nmi_unknown(). On x86 the CMPXCHG instruction returns success in
> > > > > the ZF flag, so this change saves a compare after CMPXCHG. The generated
> > > > > asm code improves from:
> > > > >
> > > > > 3e: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx
> > > > > 45: b8 ff ff ff ff mov $0xffffffff,%eax
> > > > > 4a: f0 0f b1 15 00 00 00 lock cmpxchg %edx,0x0(%rip)
> > > > > 51: 00
> > > > > 52: 83 f8 ff cmp $0xffffffff,%eax
> > > > > 55: 0f 95 c0 setne %al
> > > > >
> > > > > to:
> > > > >
> > > > > 3e: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx
> > > > > 45: b8 ff ff ff ff mov $0xffffffff,%eax
> > > > > 4a: f0 0f b1 15 00 00 00 lock cmpxchg %edx,0x0(%rip)
> > > > > 51: 00
> > > > > 52: 0f 95 c0 setne %al
> > > > >
> > > > > No functional change intended.
> > > > >
> > > > > Cc: "K. Y. Srinivasan" <[email protected]>
> > > > > Cc: Haiyang Zhang <[email protected]>
> > > > > Cc: Wei Liu <[email protected]>
> > > > > Cc: Dexuan Cui <[email protected]>
> > > > > Cc: Thomas Gleixner <[email protected]>
> > > > > Cc: Ingo Molnar <[email protected]>
> > > > > Cc: Borislav Petkov <[email protected]>
> > > > > Cc: Dave Hansen <[email protected]>
> > > > > Cc: "H. Peter Anvin" <[email protected]>
> > > > > Signed-off-by: Uros Bizjak <[email protected]>
> > > > > ---
> > > > > arch/x86/kernel/cpu/mshyperv.c | 5 ++++-
> > > > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> > > > > index e6bba12c759c..01fa06dd06b6 100644
> > > > > --- a/arch/x86/kernel/cpu/mshyperv.c
> > > > > +++ b/arch/x86/kernel/cpu/mshyperv.c
> > > > > @@ -262,11 +262,14 @@ static uint32_t __init ms_hyperv_platform(void)
> > > > >  static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
> > > > >  {
> > > > >  	static atomic_t nmi_cpu = ATOMIC_INIT(-1);
> > > > > +	unsigned int old_cpu, this_cpu;
> > > > >
> > > > >  	if (!unknown_nmi_panic)
> > > > >  		return NMI_DONE;
> > > > >
> > > > > -	if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
> > > > > +	old_cpu = -1;
> > > > > +	this_cpu = raw_smp_processor_id();
> > > > > +	if (!atomic_try_cmpxchg(&nmi_cpu, &old_cpu, this_cpu))
> > > > >  		return NMI_HANDLED;
> > > > >
> > > > >  	return NMI_DONE;
> > > > > --
> > > > > 2.41.0
> > > >
> > > > The change looks correct to me. But is there any motivation other
> > > > than saving 3 bytes of generated code? This is not a performance
> > > > sensitive path. And the change adds 3 lines of source code. So
> > > > I wonder if the change is worth the churn.
> > >
> > > Yes, I was trying to make the function easier to understand and
> > > similar to nmi_panic() from kernel/panic.c. I also had the idea of
> > > using a CPU_INVALID #define instead of -1, but IMO the above works as
> > > well.
> > >
> > > > In any case,
> > > >
> > > > Reviewed-by: Michael Kelley <[email protected]>
> >
> > Applied to hyperv-fixes.
> >
> > Uros, just so you know, DKIM verification failed when I used b4 to apply
> > this patch. You may want to check your email setup.
>
> Strange, because I didn't touch the mailer and git config for
> months... and recently I have sent many patches this way without
> problems.

This one [1] checks OK, so it looks like some transient issue with gmail.

[1] https://lore.kernel.org/lkml/[email protected]/

Thanks,
Uros.

2023-11-22 16:59:53

by Konstantin Ryabitsev

Subject: Re: [PATCH] x86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown()

November 21, 2023 at 10:51 PM, "Wei Liu" <[email protected]> wrote:
> Uros, just so you know, DKIM verification failed when I used b4 to apply
> this patch. You may want to check your email setup.

This is not actually Uros's fault. Recently, Gmail started adding a forced expiration field to their DKIM signatures, via the x= field:

DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20230601; t=1699981249; x=1700586049; darn=vger.kernel.org;
                                               ^^^^^^^^^^^^^

This gives the signature an enforced validity of only 7 days. Since the original message was sent on November 14 and you're retrieving it on November 21, this causes the DKIM check to fail.
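
The 7-day figure can be checked directly from the t= and x= values in the header quoted above. The following is only an illustrative sketch: the macro and function names are invented here, and the expiry test assumes a verifier that treats any time past x= as expired, which is one common reading of the tag rather than a description of what b4 or its DKIM library actually does.

```c
#include <stdbool.h>

/* Values copied from the DKIM-Signature header quoted above. */
#define DKIM_T 1699981249LL	/* t= : signature timestamp (Unix seconds) */
#define DKIM_X 1700586049LL	/* x= : signature expiration (Unix seconds) */

#define SECONDS_PER_DAY 86400LL

/* Validity window of a signature, in seconds. */
static long long dkim_validity_seconds(long long t, long long x)
{
	return x - t;
}

/* Assumed verifier policy: the signature is expired once now is past x=. */
static bool dkim_expired(long long x, long long now)
{
	return now > x;
}
```

With these values, dkim_validity_seconds(DKIM_T, DKIM_X) is 604800 seconds, which is exactly 7 * 86400, so any verification that runs more than a week after signing falls outside the window.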

I need to figure out how to make b4 ignore the x= field, because it's not relevant for our purposes, but the library we're using for DKIM doesn't currently have any mechanism to do so. I will open an RFE with them in the hopes that we can get this implemented.

Regards,
-K