2023-10-12 20:18:24

by Uros Bizjak

[permalink] [raw]
Subject: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro

The following patch series introduces %rip-relative addressing to
PER_CPU_VAR macro. Instruction with %rip-relative address operand is
one byte shorter than its absolute address counterpart and is also
compatible with position independent executable (-fpie) build.

The first three patches are cleanups that fix various inconsistencies
throughout the assembly code.

The last patch introduces (%rip) suffix and uses it for x86_64 target,
resulting in a small code size decrease:

text data bss dec hex filename
25510677 4386685 808388 30705750 1d48856 vmlinux-new.o
25510629 4386685 808388 30705702 1d48826 vmlinux-old.o

Patch series is against current mainline and can be applied independently
of ongoing percpu work.

v2: Introduce PER_CPU_ARG macro to conditionally enable
segment registers in cmpxchg{8,16}b_emu.S for CONFIG_SMP.

Cc: Juergen Gross <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>

Uros Bizjak (4):
x86/percpu: Introduce PER_CPU_ARG and use it in cmpxchg{8,16}b_emu.S
x86/percpu: Correct PER_CPU_VAR usage to include symbol and its addend
x86/percpu, xen: Correct PER_CPU_VAR usage to include symbol and its
addend
x86/percpu: Introduce %rip-relative addressing to PER_CPU_VAR macro

arch/x86/entry/calling.h | 2 +-
arch/x86/entry/entry_32.S | 2 +-
arch/x86/entry/entry_64.S | 2 +-
arch/x86/include/asm/percpu.h | 10 +++++++---
arch/x86/kernel/head_64.S | 2 +-
arch/x86/lib/cmpxchg16b_emu.S | 12 ++++++------
arch/x86/lib/cmpxchg8b_emu.S | 24 ++++++++++++++++++------
arch/x86/xen/xen-asm.S | 10 +++++-----
8 files changed, 40 insertions(+), 24 deletions(-)

--
2.41.0


2023-10-12 20:54:15

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro

On 10/12/23 13:12, Uros Bizjak wrote:
> The last patch introduces (%rip) suffix and uses it for x86_64 target,
> resulting in a small code size decrease: text data bss dec hex filename
> 25510677 4386685 808388 30705750 1d48856 vmlinux-new.o 25510629 4386685
> 808388 30705702 1d48826 vmlinux-old.o

I feel like I'm missing some of the motivation here.

50 bytes is great and all, but it isn't without the cost of changing
some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
confusion.

Are there some other side benefits? What else does this enable?

2023-10-12 21:00:06

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro

On Thu, Oct 12, 2023 at 10:53 PM Dave Hansen <[email protected]> wrote:
>
> On 10/12/23 13:12, Uros Bizjak wrote:
> > The last patch introduces (%rip) suffix and uses it for x86_64 target,
> > resulting in a small code size decrease: text data bss dec hex filename
> > 25510677 4386685 808388 30705750 1d48856 vmlinux-new.o 25510629 4386685
> > 808388 30705702 1d48826 vmlinux-old.o
>
> I feel like I'm missing some of the motivation here.
>
> 50 bytes is great and all, but it isn't without the cost of changing
> some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
> confusion.
>
> Are there some other side benefits? What else does this enable?

These changes are necessary to build the kernel as Position
Independent Executable (PIE) on x86_64 [1]. And since I was working in
percpu area I thought that it was worth implementing them.

[1] https://lore.kernel.org/lkml/[email protected]/

Uros.

2023-10-12 21:09:36

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro

On 10/12/23 13:59, Uros Bizjak wrote:
> On Thu, Oct 12, 2023 at 10:53 PM Dave Hansen <[email protected]> wrote:
>>
>> On 10/12/23 13:12, Uros Bizjak wrote:
>>> The last patch introduces (%rip) suffix and uses it for x86_64 target,
>>> resulting in a small code size decrease: text data bss dec hex filename
>>> 25510677 4386685 808388 30705750 1d48856 vmlinux-new.o 25510629 4386685
>>> 808388 30705702 1d48826 vmlinux-old.o
>>
>> I feel like I'm missing some of the motivation here.
>>
>> 50 bytes is great and all, but it isn't without the cost of changing
>> some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
>> confusion.
>>
>> Are there some other side benefits? What else does this enable?
>
> These changes are necessary to build the kernel as Position
> Independent Executable (PIE) on x86_64 [1]. And since I was working in
> percpu area I thought that it was worth implementing them.
>
> [1] https://lore.kernel.org/lkml/[email protected]/
>

Are you PIC-adjusting the percpu variables as well?

-hpa

2023-10-12 21:17:57

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro

On Thu, Oct 12, 2023 at 11:08 PM H. Peter Anvin <[email protected]> wrote:
>
> On 10/12/23 13:59, Uros Bizjak wrote:
> > On Thu, Oct 12, 2023 at 10:53 PM Dave Hansen <[email protected]> wrote:
> >>
> >> On 10/12/23 13:12, Uros Bizjak wrote:
> >>> The last patch introduces (%rip) suffix and uses it for x86_64 target,
> >>> resulting in a small code size decrease: text data bss dec hex filename
> >>> 25510677 4386685 808388 30705750 1d48856 vmlinux-new.o 25510629 4386685
> >>> 808388 30705702 1d48826 vmlinux-old.o
> >>
> >> I feel like I'm missing some of the motivation here.
> >>
> >> 50 bytes is great and all, but it isn't without the cost of changing
> >> some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
> >> confusion.
> >>
> >> Are there some other side benefits? What else does this enable?
> >
> > These changes are necessary to build the kernel as Position
> > Independent Executable (PIE) on x86_64 [1]. And since I was working in
> > percpu area I thought that it was worth implementing them.
> >
> > [1] https://lore.kernel.org/lkml/[email protected]/
> >
>
> Are you PIC-adjusting the percpu variables as well?

After this patch (and after fixing percpu_stable_op to use "a" operand
modifier on GCC), the only *one* remaining absolute reference to
percpu variable remain in xen-head.S, where:

movq $INIT_PER_CPU_VAR(fixed_percpu_data),%rax

should be changed to use leaq.

All others should then be (%rip)-relative.

Uros.

2023-10-12 21:22:38

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro

On 10/12/23 14:17, Uros Bizjak wrote:
>>
>> Are you PIC-adjusting the percpu variables as well?
>
> After this patch (and after fixing percpu_stable_op to use "a" operand
> modifier on GCC), the only *one* remaining absolute reference to
> percpu variable remain in xen-head.S, where:
>
> movq $INIT_PER_CPU_VAR(fixed_percpu_data),%rax
>
> should be changed to use leaq.
>
> All others should then be (%rip)-relative.
>

I mean, the symbols themselves are relative, not absolute?

-hpa

2023-10-12 22:45:46

by Uros Bizjak

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro

On Thu, Oct 12, 2023 at 11:22 PM H. Peter Anvin <[email protected]> wrote:
>
> On 10/12/23 14:17, Uros Bizjak wrote:
> >>
> >> Are you PIC-adjusting the percpu variables as well?
> >
> > After this patch (and after fixing percpu_stable_op to use "a" operand
> > modifier on GCC), the only *one* remaining absolute reference to
> > percpu variable remain in xen-head.S, where:
> >
> > movq $INIT_PER_CPU_VAR(fixed_percpu_data),%rax
> >
> > should be changed to use leaq.
> >
> > All others should then be (%rip)-relative.
> >
>
> I mean, the symbols themselves are relative, not absolute?

The reference to the symbol is relative to the segment register, but
absolute to the location of the instruction. If the executable changes
location, then instruction moves around and reference is not valid
anymore. (%rip)-relative reference compensate for changed location of
the instruction.

Uros.