2022-08-04 15:42:29

by Kanna Scarlet

Subject: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg

Replace mov $0, %reg with xor %reg, %reg because the xor encoding is
smaller, which saves space.

asm:
  ba 00 00 00 00    movl $0x0,%edx
  31 d2             xorl %edx,%edx

Suggested-by: Ammar Faizi <[email protected]>
Signed-off-by: Kanna Scarlet <[email protected]>
---
arch/x86/boot/compressed/head_64.S | 2 +-
arch/x86/boot/compressed/mem_encrypt.S | 2 +-
arch/x86/kernel/ftrace_32.S | 4 ++--
arch/x86/kernel/head_64.S | 2 +-
arch/x86/math-emu/div_Xsig.S | 2 +-
arch/x86/math-emu/reg_u_sub.S | 2 +-
6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index d33f060900d2..39442e7f5993 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -666,7 +666,7 @@ SYM_CODE_START(trampoline_32bit_src)
movl %cr4, %eax
andl $X86_CR4_MCE, %eax
#else
- movl $0, %eax
+ xorl %eax, %eax
#endif

/* Enable PAE and LA57 (if required) paging modes */
diff --git a/arch/x86/boot/compressed/mem_encrypt.S b/arch/x86/boot/compressed/mem_encrypt.S
index a73e4d783cae..d1e4d3aa8395 100644
--- a/arch/x86/boot/compressed/mem_encrypt.S
+++ b/arch/x86/boot/compressed/mem_encrypt.S
@@ -111,7 +111,7 @@ SYM_CODE_START(startup32_vc_handler)
cmpl $0x72, 16(%esp)
jne .Lfail

- movl $0, %eax # Request CPUID[fn].EAX
+ xorl %eax, %eax # Request CPUID[fn].EAX
movl %ebx, %edx # CPUID fn
call sev_es_req_cpuid # Call helper
testl %eax, %eax # Check return code
diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S
index a0ed0e4a2c0c..cff7decb58be 100644
--- a/arch/x86/kernel/ftrace_32.S
+++ b/arch/x86/kernel/ftrace_32.S
@@ -171,7 +171,7 @@ SYM_CODE_START(ftrace_graph_caller)
movl 3*4(%esp), %eax
/* Even with frame pointers, fentry doesn't have one here */
lea 4*4(%esp), %edx
- movl $0, %ecx
+ xorl %ecx, %ecx
subl $MCOUNT_INSN_SIZE, %eax
call prepare_ftrace_return
popl %edx
@@ -184,7 +184,7 @@ SYM_CODE_END(ftrace_graph_caller)
return_to_handler:
pushl %eax
pushl %edx
- movl $0, %eax
+ xorl %eax, %eax
call ftrace_return_to_handler
movl %eax, %ecx
popl %edx
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index d860d437631b..eeb06047e30a 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -184,7 +184,7 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
movq %cr4, %rcx
andl $X86_CR4_MCE, %ecx
#else
- movl $0, %ecx
+ xorl %ecx, %ecx
#endif

/* Enable PAE mode, PGE and LA57 */
diff --git a/arch/x86/math-emu/div_Xsig.S b/arch/x86/math-emu/div_Xsig.S
index 8c270ab415be..5767b4d23954 100644
--- a/arch/x86/math-emu/div_Xsig.S
+++ b/arch/x86/math-emu/div_Xsig.S
@@ -122,7 +122,7 @@ SYM_FUNC_START(div_Xsig)
movl XsigLL(%esi),%eax
rcrl %eax
movl %eax,FPU_accum_1
- movl $0,%eax
+ xorl %eax,%eax
rcrl %eax
movl %eax,FPU_accum_0

diff --git a/arch/x86/math-emu/reg_u_sub.S b/arch/x86/math-emu/reg_u_sub.S
index 4c900c29e4ff..130b49fa1ca2 100644
--- a/arch/x86/math-emu/reg_u_sub.S
+++ b/arch/x86/math-emu/reg_u_sub.S
@@ -212,7 +212,7 @@ L_must_be_zero:
L_shift_32:
movl %ebx,%eax
movl %edx,%ebx
- movl $0,%edx
+ xorl %edx,%edx
subw $32,EXP(%edi) /* Can get underflow here */

/* We need to shift left by 1 - 31 bits */
--
Kanna Scarlet
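
The size difference quoted in the commit message follows directly from
the instruction encodings. Below is a minimal C sketch of those two
encodings, not kernel code. The function names encode_mov_zero and
encode_xor_self are made up for illustration, and only the eight legacy
32-bit registers (no REX prefix) are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register numbers as used in the opcode and ModRM fields. */
enum { EAX = 0, ECX = 1, EDX = 2, EBX = 3 };

/* Encode "movl $0, %reg": opcode B8+reg followed by a 4-byte immediate. */
size_t encode_mov_zero(unsigned reg, uint8_t out[5])
{
	assert(reg < 8);
	out[0] = (uint8_t)(0xB8 + reg);
	out[1] = out[2] = out[3] = out[4] = 0x00;	/* imm32 = 0 */
	return 5;
}

/* Encode "xorl %reg, %reg": opcode 31 /r, ModRM with mod=11, reg=rm. */
size_t encode_xor_self(unsigned reg, uint8_t out[2])
{
	assert(reg < 8);
	out[0] = 0x31;
	out[1] = (uint8_t)(0xC0 | (reg << 3) | reg);
	return 2;
}
```

For %edx (register number 2) this produces ba 00 00 00 00 and 31 d2,
the same bytes as in the listing above: 5 bytes versus 2.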



2022-08-04 16:00:35

by Borislav Petkov

Subject: Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg

On Thu, Aug 04, 2022 at 03:26:55PM +0000, Kanna Scarlet wrote:
> Change mov $0, %reg with xor %reg, %reg because xor %reg, %reg is
> smaller so it is good to save space

Bonus points if you find out what other advantage

XOR reg,reg

has when it comes to clearing integer registers.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2022-08-04 18:40:21

by Kanna Scarlet

Subject: Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg

On 8/4/22 10:53 PM, Borislav Petkov wrote:
> Bonus points if you find out what other advantage
>
> XOR reg,reg
>
> has when it comes to clearing integer registers.

Hello sir Borislav,

Thank you for your response. I tried to find out other advantages of
xor reg,reg on Google and found this:
https://stackoverflow.com/a/33668295/7275114

"xor (being a recognized zeroing idiom, unlike mov reg, 0) has some
obvious and some subtle advantages:

1. smaller code-size than mov reg,0. (All CPUs)
2. avoids partial-register penalties for later code.
(Intel P6-family and SnB-family).
3. doesn't use an execution unit, saving power and freeing up
execution resources. (Intel SnB-family)
4. smaller uop (no immediate data) leaves room in the uop cache-line
for nearby instructions to borrow if needed. (Intel SnB-family).
5. doesn't use up entries in the physical register file. (Intel
SnB-family (and P4) at least, possibly AMD as well since they use
a similar PRF design instead of keeping register state in the ROB
like Intel P6-family microarchitectures.)"

Should I add all of this to the explanation, sir? I will send a v2
revision tomorrow.

We also found more files to patch with this command:

grep -rE 'mov.?\s+\$0\s*,' arch/x86

It shows many moves of an immediate zero to a 64-bit register in
arch/x86/crypto/curve25519-x86_64.c, but the next instruction may depend
on the previous %rflags value. We are afraid to change these because
xor modifies %rflags. Instead, we will try changing them to
movl $0, %r32 to reduce the code size.

Example: cmovc needs %rflags

" adcx %1, %%r11;"
" movq %%r11, 24(%2);"

/* Step 3: Fold the carry bit back in; guaranteed not to carry at this point */
" mov $0, %%rax;"
" cmovc %%rdx, %%rax;"

Thanks.

Regards,
--
Kanna Scarlet
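
The %rflags hazard described above can be illustrated with a toy model.
The C sketch below is not real hardware and not kernel code: struct
flags_model and all function names are hypothetical, and only the carry
flag is modelled. It assumes CF was left by an earlier instruction (such
as the adcx in the example) and compares zeroing %rax with mov against
zeroing it with xor before the cmovc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy state: destination, source, and the carry flag only. */
struct flags_model {
	uint64_t rax;
	uint64_t rdx;
	bool cf;
};

/* mov $0, %rax: writes the register, leaves the flags untouched. */
void model_mov_zero(struct flags_model *m)
{
	assert(m != NULL);
	m->rax = 0;
}

/* xor %eax, %eax: zeroes the register and clears CF. */
void model_xor_zero(struct flags_model *m)
{
	assert(m != NULL);
	m->rax = 0;
	m->cf = false;
}

/* cmovc %rdx, %rax: copies only when CF is set. */
void model_cmovc(struct flags_model *m)
{
	assert(m != NULL);
	if (m->cf)
		m->rax = m->rdx;
}

/* Zero with mov, then cmovc: the incoming carry is still visible. */
uint64_t fold_with_mov(bool carry_in, uint64_t rdx)
{
	struct flags_model m = { .rax = ~0ULL, .rdx = rdx, .cf = carry_in };

	model_mov_zero(&m);
	model_cmovc(&m);
	return m.rax;
}

/* Zero with xor, then cmovc: the carry has been destroyed. */
uint64_t fold_with_xor(bool carry_in, uint64_t rdx)
{
	struct flags_model m = { .rax = ~0ULL, .rdx = rdx, .cf = carry_in };

	model_xor_zero(&m);
	model_cmovc(&m);
	return m.rax;
}
```

With an incoming carry, fold_with_mov() returns the %rdx value while
fold_with_xor() always returns 0, so the replacement would change the
result. The same applies before any other instruction that reads CF,
for example adc or rcr.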


2022-08-05 09:36:08

by David Laight

Subject: RE: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg

From: Kanna Scarlet
> Sent: 04 August 2022 19:08
>
> On 8/4/22 10:53 PM, Borislav Petkov wrote:
> > Bonus points if you find out what other advantage
> >
> > XOR reg,reg
> >
> > has when it comes to clearing integer registers.
>
> Hello sir Borislav,
>
> Thank you for your response. I tried to find out other advantages of
> xor reg,reg on Google and found this:
> https://stackoverflow.com/a/33668295/7275114
>
> "xor (being a recognized zeroing idiom, unlike mov reg, 0) has some
> obvious and some subtle advantages:
>
> 1. smaller code-size than mov reg,0. (All CPUs)
> 2. avoids partial-register penalties for later code.
> (Intel P6-family and SnB-family).
> 3. doesn't use an execution unit, saving power and freeing up
> execution resources. (Intel SnB-family)
> 4. smaller uop (no immediate data) leaves room in the uop cache-line
> for nearby instructions to borrow if needed. (Intel SnB-family).
> 5. doesn't use up entries in the physical register file. (Intel
> SnB-family (and P4) at least, possibly AMD as well since they use
> a similar PRF design instead of keeping register state in the ROB
> like Intel P6-family microarchitectures.)"

You missed one, and an additional change:

Use "xor %rax,%rax" instead of "xor %eax,%eax" to save
the 'reg' prefix.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


2022-08-05 10:03:35

by Borislav Petkov

Subject: Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg

On Thu, Aug 04, 2022 at 06:08:05PM +0000, Kanna Scarlet wrote:
> Hello sir Borislav,

Please, no "sir" - just Boris or Borislav,

> Thank you for your response. I tried to find out other advantages of
> xor reg,reg on Google and found this:
> https://stackoverflow.com/a/33668295/7275114
>
> "xor (being a recognized zeroing idiom, unlike mov reg, 0) has some
> obvious and some subtle advantages:
>
> 1. smaller code-size than mov reg,0. (All CPUs)
> 2. avoids partial-register penalties for later code.
> (Intel P6-family and SnB-family).
> 3. doesn't use an execution unit, saving power and freeing up
> execution resources. (Intel SnB-family)
> 4. smaller uop (no immediate data) leaves room in the uop cache-line
> for nearby instructions to borrow if needed. (Intel SnB-family).
> 5. doesn't use up entries in the physical register file. (Intel
> SnB-family (and P4) at least, possibly AMD as well since they use
> a similar PRF design instead of keeping register state in the ROB
> like Intel P6-family microarchitectures.)"
>
> Should I add all in the explanation sir?

You should try to understand what this means and write the gist of it in
your own words. This is how you can learn something.

> We also find more files to patch with this command:
>
> grep -rE 'mov.?\s+\$0\s*,' arch/x86
>
> it shows many immediate zero moves to 64-bit register in file
> arch/x86/crypto/curve25519-x86_64.c, but the next instruction may depend
> on the previous %rflags value, we are afraid to change this because
> xor touches %rflags. We will try to change it to movl $0, %r32 to
> reduce the code size.

I don't think you need to do that - you can do this one patch in order
to go through the whole process of creating and submitting a patch but
you should not go on a "let's convert everything" spree just for the
sake of it.

Because maintainers barely have time to look at patches, you don't have
to send them more when they're not really needed.

Rather, I'd suggest you go and try to fix real bugs. This has some ideas
what to do:

https://www.linux.com/news/three-ways-beginners-contribute-linux-kernel/

Looking at the kernel bugzilla and trying to understand and reproduce a
bug from there would get you a long way. And you'll learn a lot.

Also, you should peruse

https://www.kernel.org/doc/html/latest/process/index.html

which has a lot of information about how this whole community thing
works.

I sincerely hope that helps.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2022-08-05 10:16:36

by Joerg Roedel

Subject: Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg

On Fri, Aug 05, 2022 at 09:26:02AM +0000, David Laight wrote:
> Use "xor %rax,%rax" instead of "xor %eax,%eax" to save
> the 'reg' prefix.

Also, some places explicitly use the mov variant to zero a register
without touching rflags. Please be careful to not change those.

Regards,

--
Jörg Rödel
[email protected]

SUSE Software Solutions Germany GmbH
Frankenstraße 146
90461 Nürnberg
Germany

(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman


2022-08-08 16:48:12

by Kanna Scarlet

Subject: Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg

On 8/5/22 4:26 PM, David Laight wrote:
> Use "xor %rax,%rax" instead of "xor %eax,%eax" to save
> the 'reg' prefix.

Hello David Laight,

"xor %rax,%rax" is bigger because it needs a REX prefix; "xor %eax,%eax"
is smaller because it does not.

asm:
0: 48 31 c0 xor %rax,%rax
3: 31 c0 xor %eax,%eax

So I think we should use xor %eax,%eax instead of xor %rax,%rax to avoid
the REX prefix.

Best regards,
--
Kanna Scarlet
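
The prefix rule behind those two encodings can be sketched in C. This
is an illustration only, and encode_xor_zero_rex is a made-up name. A
REX prefix is emitted when the operand size is 64-bit (REX.W) or when
the register is one of r8..r15 (REX.R and REX.B). In 64-bit mode a
write to a 32-bit register zero-extends into the full 64-bit register,
so the 32-bit form clears all of %rax as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode "xor %reg, %reg" for register number 0..15
 * (0 = rax/eax, 8 = r8/r8d). wide != 0 selects 64-bit operand size.
 * Returns the number of bytes written to out (2 or 3).
 */
size_t encode_xor_zero_rex(unsigned reg, int wide, uint8_t out[3])
{
	size_t n = 0;
	uint8_t rex = 0x40;			/* REX base: 0100WRXB */

	assert(reg < 16);
	if (wide)
		rex |= 0x08;			/* REX.W: 64-bit operand */
	if (reg >= 8)
		rex |= 0x04 | 0x01;		/* REX.R and REX.B */
	if (rex != 0x40)
		out[n++] = rex;			/* prefix only when needed */
	out[n++] = 0x31;			/* xor r/m, r */
	out[n++] = (uint8_t)(0xC0 | ((reg & 7) << 3) | (reg & 7));
	return n;
}
```

Register 0 gives 48 31 c0 with wide set and 31 c0 without, matching the
disassembly above. For r8..r15 the prefix is needed either way, so
xor %r8d,%r8d (45 31 c0) is the same length as xor %r8,%r8 (4d 31 c0).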

2022-08-08 16:49:05

by Kanna Scarlet

Subject: Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg

On 8/5/22 4:42 PM, Joerg Roedel wrote:
> On Fri, Aug 05, 2022 at 09:26:02AM +0000, David Laight wrote:
>> Use "xor %rax,%rax" instead of "xor %eax,%eax" to save
>> the 'reg' prefix.
>
> Also, some places explicitly use the mov variant to zero a register
> without touching rflags. Please be careful to not change those.

Thank you for the reminder. I will check again to make sure the patch
does not break any %rflags dependency.

Regards,
--
Kanna Scarlet

2022-08-08 17:09:10

by Kanna Scarlet

Subject: Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg

On 8/5/22 4:54 PM, Borislav Petkov wrote:
> On Thu, Aug 04, 2022 at 06:08:05PM +0000, Kanna Scarlet wrote:
>> Hello sir Borislav,
>
> Please, no "sir" - just Boris or Borislav,

ok, sorry

> I don't think you need to do that - you can do this one patch in order
> to go through the whole process of creating and submitting a patch but
> you should not go on a "let's convert everything" spree just for the
> sake of it.

OK, I will finish the process for this one patch to learn the
submission process. After that I will avoid similar small improvements
and focus on real kernel bugs/issues. I'll send a v2 revision with only
the commit message improved.

> Because maintainers barely have time to look at patches, you don't have
> to send them more when they're not really needed.
>
> Rather, I'd suggest you go and try to fix real bugs. This has some ideas
> what to do:
>
> https://www.linux.com/news/three-ways-beginners-contribute-linux-kernel/
>
> Looking at the kernel bugzilla and trying to understand and reproduce a
> bug from there would get you a long way. And you'll learn a lot.
>
> Also, you should peruse
>
> https://www.kernel.org/doc/html/latest/process/index.html
>
> which has a lot of information about how this whole community thing
> works.
>
> I sincerely hope that helps.
>
> Thx.

thank you for the guide, I'm following it

Regards,
--
Kanna Scarlet

2022-08-08 19:32:05

by H. Peter Anvin

Subject: Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg

On August 8, 2022 9:45:45 AM PDT, Kanna Scarlet <[email protected]>
wrote:
>On 8/5/22 4:42 PM, Joerg Roedel wrote:
>> On Fri, Aug 05, 2022 at 09:26:02AM +0000, David Laight wrote:
>>> Use "xor %rax,%rax" instead of "xor %eax,%eax" to save
>>> the 'reg' prefix.
>>
>> Also, some places explicitly use the mov variant to zero a register
>> without touching rflags. Please be careful to not change those.
>
>thank you for reminder, i will check again to make myself more sure
>the patch doesn't break this %rflags dependency situation
>
>Regards,

In some cases you can hoist the zeroing to avoid that (and sometimes
improve performance in the process), but be very careful in general when
messing with hand-optimized assembly code like crypto; for those pieces
of code benchmarking the change is mandatory.
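
Hoisting means moving the zeroing above the instruction that produces
the flags, so the flag clobber from xor no longer matters. The C sketch
below is a toy model under stated assumptions, not real code from the
tree: struct toy_cpu and the function names are hypothetical, and a
plain add stands in for the flag-producing step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy machine state: three registers and the carry flag. */
struct toy_cpu {
	uint64_t rax, rdx, r11;
	bool cf;
};

/* add %src, %r11: stand-in flag producer, sets CF on wrap-around. */
void toy_add_r11(struct toy_cpu *c, uint64_t src)
{
	assert(c != NULL);
	c->r11 += src;
	c->cf = c->r11 < src;
}

/* Original order: add, then mov $0 (flags preserved), then cmovc. */
uint64_t fold_mov_after(uint64_t r11, uint64_t src, uint64_t rdx)
{
	struct toy_cpu c = { .rax = ~0ULL, .rdx = rdx, .r11 = r11, .cf = false };

	toy_add_r11(&c, src);
	c.rax = 0;			/* mov $0, %rax */
	if (c.cf)			/* cmovc %rdx, %rax */
		c.rax = c.rdx;
	return c.rax;
}

/* Hoisted order: xor first, then add overwrites CF, then cmovc. */
uint64_t fold_xor_hoisted(uint64_t r11, uint64_t src, uint64_t rdx)
{
	struct toy_cpu c = { .rax = ~0ULL, .rdx = rdx, .r11 = r11, .cf = false };

	c.rax = 0;			/* xor %eax, %eax ... */
	c.cf = false;			/* ... which also clears CF */
	toy_add_r11(&c, src);
	if (c.cf)			/* cmovc %rdx, %rax */
		c.rax = c.rdx;
	return c.rax;
}
```

Both orderings return the same value for every input in this model. In
real code the xor has to go above every instruction whose flag output
is still needed, which for a carry chain means above the whole chain,
and %rax must not be used in between.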

2022-08-08 19:44:37

by H. Peter Anvin

Subject: Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg

On August 5, 2022 2:26:02 AM PDT, David Laight <[email protected]>
wrote:
>From: Kanna Scarlet
>> Sent: 04 August 2022 19:08
>>
>> On 8/4/22 10:53 PM, Borislav Petkov wrote:
>> > Bonus points if you find out what other advantage
>> >
>> > XOR reg,reg
>> >
>> > has when it comes to clearing integer registers.
>>
>> Hello sir Borislav,
>>
>> Thank you for your response. I tried to find out other advantages of
>> xor reg,reg on Google and found this:
>> https://stackoverflow.com/a/33668295/7275114
>>
>> "xor (being a recognized zeroing idiom, unlike mov reg, 0) has some
>> obvious and some subtle advantages:
>>
>> 1. smaller code-size than mov reg,0. (All CPUs)
>> 2. avoids partial-register penalties for later code.
>> (Intel P6-family and SnB-family).
>> 3. doesn't use an execution unit, saving power and freeing up
>> execution resources. (Intel SnB-family)
>> 4. smaller uop (no immediate data) leaves room in the uop cache-line
>> for nearby instructions to borrow if needed. (Intel SnB-family).
>> 5. doesn't use up entries in the physical register file. (Intel
>> SnB-family (and P4) at least, possibly AMD as well since they use
>> a similar PRF design instead of keeping register state in the ROB
>> like Intel P6-family microarchitectures.)"
>
>You missed one, and an additional change:
>
>Use "xor %rax,%rax" instead of "xor %eax,%eax" to save
>the 'reg' prefix.
>
> David
>
>-
>Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
>Registration No: 1397386 (Wales)
>
>

You mean the other way around...

2022-08-09 07:55:54

by David Laight

Subject: RE: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg

From: H. Peter Anvin
> Sent: 08 August 2022 20:00
>
> On August 5, 2022 2:26:02 AM PDT, David Laight <[email protected]>
> wrote:
> >From: Kanna Scarlet
> >> Sent: 04 August 2022 19:08
> >>
> >> On 8/4/22 10:53 PM, Borislav Petkov wrote:
> >> > Bonus points if you find out what other advantage
> >> >
> >> > XOR reg,reg
> >> >
> >> > has when it comes to clearing integer registers.
> >>
> >> Hello sir Borislav,
> >>
> >> Thank you for your response. I tried to find out other advantages of
> >> xor reg,reg on Google and found this:
> >> https://stackoverflow.com/a/33668295/7275114
> >>
> >> "xor (being a recognized zeroing idiom, unlike mov reg, 0) has some
> >> obvious and some subtle advantages:
> >>
> >> 1. smaller code-size than mov reg,0. (All CPUs)
> >> 2. avoids partial-register penalties for later code.
> >> (Intel P6-family and SnB-family).
> >> 3. doesn't use an execution unit, saving power and freeing up
> >> execution resources. (Intel SnB-family)
> >> 4. smaller uop (no immediate data) leaves room in the uop cache-line
> >> for nearby instructions to borrow if needed. (Intel SnB-family).
> >> 5. doesn't use up entries in the physical register file. (Intel
> >> SnB-family (and P4) at least, possibly AMD as well since they use
> >> a similar PRF design instead of keeping register state in the ROB
> >> like Intel P6-family microarchitectures.)"
> >
> >You missed one, and an additional change:
> >
> >Use "xor %rax,%rax" instead of "xor %eax,%eax" to save
> >the 'reg' prefix.
> >
> > David
> >
> >-
> >Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> >Registration No: 1397386 (Wales)
> >
> >
>
> You mean the other way around...

Maybe :-(
The 32bit versions are best.
Somehow the register naming convention ended up getting sort of 'backwards'.
'register' is bigger than 'extended'.
I've 'only' been writing x86 asm since 1982!

David
