2022-10-19 12:08:24

by Li, Xin3

Subject: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()

From: "H. Peter Anvin (Intel)" <[email protected]>

The LKGS instruction atomically loads a segment descriptor into the
%gs descriptor registers, *except* that %gs.base is unchanged, and the
base is instead loaded into MSR_IA32_KERNEL_GS_BASE, which is exactly
what we want this function to do.

Signed-off-by: H. Peter Anvin (Intel) <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Brian Gerst <[email protected]>
Signed-off-by: Xin Li <[email protected]>
---

Changes since v3:
* We want less ASM not more, thus keep local_irq_save/restore() inside
native_load_gs_index() (Thomas Gleixner).
* For paravirt enabled kernels, initialize pv_ops.cpu.load_gs_index to
native_lkgs (Thomas Gleixner).

Changes since V2:
* Mark DI as input and output (+D) as in V1, since the exception handler
modifies it (Brian Gerst).

Changes since V1:
* Use EX_TYPE_ZERO_REG instead of fixup code in the obsolete .fixup code
section (Peter Zijlstra).
* Add a comment that states the LKGS_DI macro will be replaced with "lkgs %di"
once the binutils support the LKGS instruction (Peter Zijlstra).
---
arch/x86/include/asm/gsseg.h | 33 +++++++++++++++++++++++++++++----
arch/x86/kernel/cpu/common.c | 1 +
2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/gsseg.h b/arch/x86/include/asm/gsseg.h
index d15577c39e8d..ab6a595cea70 100644
--- a/arch/x86/include/asm/gsseg.h
+++ b/arch/x86/include/asm/gsseg.h
@@ -14,17 +14,42 @@

extern asmlinkage void asm_load_gs_index(u16 selector);

+/* Replace with "lkgs %di" once binutils support LKGS instruction */
+#define LKGS_DI _ASM_BYTES(0xf2,0x0f,0x00,0xf7)
+
+static inline void native_lkgs(unsigned int selector)
+{
+	u16 sel = selector;
+	asm_inline volatile("1: " LKGS_DI
+			    _ASM_EXTABLE_TYPE_REG(1b, 1b, EX_TYPE_ZERO_REG, %k[sel])
+			    : [sel] "+D" (sel));
+}
+
static inline void native_load_gs_index(unsigned int selector)
{
-	unsigned long flags;
+	if (cpu_feature_enabled(X86_FEATURE_LKGS)) {
+		native_lkgs(selector);
+	} else {
+		unsigned long flags;

-	local_irq_save(flags);
-	asm_load_gs_index(selector);
-	local_irq_restore(flags);
+		local_irq_save(flags);
+		asm_load_gs_index(selector);
+		local_irq_restore(flags);
+	}
}

#endif /* CONFIG_X86_64 */

+static inline void __init lkgs_init(void)
+{
+#ifdef CONFIG_PARAVIRT_XXL
+#ifdef CONFIG_X86_64
+	if (cpu_feature_enabled(X86_FEATURE_LKGS))
+		pv_ops.cpu.load_gs_index = native_lkgs;
+#endif
+#endif
+}
+
#ifndef CONFIG_PARAVIRT_XXL

static inline void load_gs_index(unsigned int selector)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3e508f239098..d6eb4f60b47d 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1939,6 +1939,7 @@ void __init identify_boot_cpu(void)
 	setup_cr_pinning();

 	tsx_init();
+	lkgs_init();
}

void identify_secondary_cpu(struct cpuinfo_x86 *c)
--
2.34.1
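
The commit message's claim that LKGS does "exactly what we want this function to do" can be checked against a small userspace model. This is only a sketch: struct cpu_model, desc_base() and the sample base addresses are hypothetical, and the legacy path is assumed to be the swapgs / mov-to-%gs / swapgs sequence behind asm_load_gs_index(). Nothing here executes the real instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the GS-related CPU state. */
struct cpu_model {
	uint16_t gs_sel;         /* visible %gs selector */
	uint64_t gs_base;        /* active GS base (kernel's while in ring 0) */
	uint64_t kernel_gs_base; /* models MSR_IA32_KERNEL_GS_BASE */
};

/* Hypothetical descriptor lookup: derive a fake base from the selector. */
static uint64_t desc_base(uint16_t sel)
{
	return (uint64_t)sel << 12;
}

/* SWAPGS exchanges the active base with the MSR value. */
static void swapgs(struct cpu_model *c)
{
	uint64_t tmp = c->gs_base;

	c->gs_base = c->kernel_gs_base;
	c->kernel_gs_base = tmp;
}

/* A plain segment load updates the selector and the *active* base. */
static void mov_to_gs(struct cpu_model *c, uint16_t sel)
{
	c->gs_sel = sel;
	c->gs_base = desc_base(sel);
}

/* Assumed legacy sequence: swapgs, load %gs, swapgs. */
static void legacy_load_gs_index(struct cpu_model *c, uint16_t sel)
{
	swapgs(c);
	mov_to_gs(c, sel);
	swapgs(c);
}

/* LKGS as described above: active base untouched, base goes to the MSR. */
static void lkgs(struct cpu_model *c, uint16_t sel)
{
	c->gs_sel = sel;
	c->kernel_gs_base = desc_base(sel);
}

/* Run both paths from the same starting state and compare the results. */
static bool same_result(uint16_t sel)
{
	struct cpu_model a = { 0, 0xffff888000000000ull, 0x00007f0000000000ull };
	struct cpu_model b = a;

	legacy_load_gs_index(&a, sel);
	lkgs(&b, sel);

	/* The kernel's active base must survive the legacy sequence. */
	assert(a.gs_base == 0xffff888000000000ull);

	return a.gs_sel == b.gs_sel &&
	       a.gs_base == b.gs_base &&
	       a.kernel_gs_base == b.kernel_gs_base;
}
```

In the model the two paths end in the same state, but the legacy path passes through a window in which the active base is not the kernel's. That window is the reason the non-LKGS branch keeps local_irq_save()/local_irq_restore().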


2022-10-19 13:01:56

by Juergen Gross

Subject: Re: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()

On 19.10.22 11:50, Xin Li wrote:
> From: "H. Peter Anvin (Intel)" <[email protected]>
>
> The LKGS instruction atomically loads a segment descriptor into the
> %gs descriptor registers, *except* that %gs.base is unchanged, and the
> base is instead loaded into MSR_IA32_KERNEL_GS_BASE, which is exactly
> what we want this function to do.
>
> Signed-off-by: H. Peter Anvin (Intel) <[email protected]>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> Signed-off-by: Brian Gerst <[email protected]>
> Signed-off-by: Xin Li <[email protected]>
> ---
>
> Changes since v3:
> * We want less ASM not more, thus keep local_irq_save/restore() inside
> native_load_gs_index() (Thomas Gleixner).
> * For paravirt enabled kernels, initialize pv_ops.cpu.load_gs_index to
> native_lkgs (Thomas Gleixner).
>
> Changes since V2:
> * Mark DI as input and output (+D) as in V1, since the exception handler
> modifies it (Brian Gerst).
>
> Changes since V1:
> * Use EX_TYPE_ZERO_REG instead of fixup code in the obsolete .fixup code
> section (Peter Zijlstra).
> * Add a comment that states the LKGS_DI macro will be replaced with "lkgs %di"
> once the binutils support the LKGS instruction (Peter Zijlstra).
> ---
> arch/x86/include/asm/gsseg.h | 33 +++++++++++++++++++++++++++++----
> arch/x86/kernel/cpu/common.c | 1 +
> 2 files changed, 30 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/gsseg.h b/arch/x86/include/asm/gsseg.h
> index d15577c39e8d..ab6a595cea70 100644
> --- a/arch/x86/include/asm/gsseg.h
> +++ b/arch/x86/include/asm/gsseg.h
> @@ -14,17 +14,42 @@
>
> extern asmlinkage void asm_load_gs_index(u16 selector);
>
> +/* Replace with "lkgs %di" once binutils support LKGS instruction */
> +#define LKGS_DI _ASM_BYTES(0xf2,0x0f,0x00,0xf7)
> +
> +static inline void native_lkgs(unsigned int selector)
> +{
> +	u16 sel = selector;
> +	asm_inline volatile("1: " LKGS_DI
> +			    _ASM_EXTABLE_TYPE_REG(1b, 1b, EX_TYPE_ZERO_REG, %k[sel])
> +			    : [sel] "+D" (sel));
> +}
> +
> static inline void native_load_gs_index(unsigned int selector)
> {
> -	unsigned long flags;
> +	if (cpu_feature_enabled(X86_FEATURE_LKGS)) {
> +		native_lkgs(selector);
> +	} else {
> +		unsigned long flags;
>
> -	local_irq_save(flags);
> -	asm_load_gs_index(selector);
> -	local_irq_restore(flags);
> +		local_irq_save(flags);
> +		asm_load_gs_index(selector);
> +		local_irq_restore(flags);
> +	}
> }
>
> #endif /* CONFIG_X86_64 */
>
> +static inline void __init lkgs_init(void)
> +{
> +#ifdef CONFIG_PARAVIRT_XXL
> +#ifdef CONFIG_X86_64
> +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
> +		pv_ops.cpu.load_gs_index = native_lkgs;

For this to work correctly when running as a Xen PV guest, you need to add

setup_clear_cpu_cap(X86_FEATURE_LKGS);

to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as otherwise
the Xen specific .load_gs_index vector will be overwritten.
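
The overwrite described above can be illustrated with a userspace model. It is a sketch under stated assumptions, not kernel code: cpu_caps[], clear_cap(), load_gs_index_op and xen_hook_survives() are hypothetical stand-ins for the feature bitmap, setup_clear_cpu_cap(), pv_ops.cpu.load_gs_index and the boot ordering, and it assumes the Xen PV hook is installed before lkgs_init() runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature numbering for the model. */
enum { FEATURE_LKGS = 0, NR_FEATURES = 1 };

static bool cpu_caps[NR_FEATURES];

/* Empty stand-ins; only their addresses matter in this model. */
static void native_load_gs_index(unsigned int sel) { (void)sel; }
static void native_lkgs(unsigned int sel) { (void)sel; }
static void xen_load_gs_index(unsigned int sel) { (void)sel; }

/* Models pv_ops.cpu.load_gs_index. */
static void (*load_gs_index_op)(unsigned int) = native_load_gs_index;

/* Models setup_clear_cpu_cap(). */
static void clear_cap(int bit)
{
	assert(bit >= 0 && bit < NR_FEATURES);
	cpu_caps[bit] = false;
}

/* Mirrors the logic of lkgs_init() from the patch. */
static void lkgs_init(void)
{
	if (cpu_caps[FEATURE_LKGS])
		load_gs_index_op = native_lkgs;
}

/* Boot a model Xen PV guest; report whether the Xen hook is still set. */
static bool xen_hook_survives(bool clear_lkgs)
{
	cpu_caps[FEATURE_LKGS] = true;        /* hardware advertises LKGS */
	load_gs_index_op = xen_load_gs_index; /* Xen PV installs its hook */

	if (clear_lkgs)
		clear_cap(FEATURE_LKGS);      /* the suggested fix */

	lkgs_init();
	return load_gs_index_op == xen_load_gs_index;
}
```

In this model xen_hook_survives(false) returns false because lkgs_init() replaces the hook, while xen_hook_survives(true) returns true because the cleared feature bit makes lkgs_init() a no-op.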


Juergen



2022-10-19 18:24:06

by Li, Xin3

Subject: RE: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()

> > +static inline void __init lkgs_init(void)
> > +{
> > +#ifdef CONFIG_PARAVIRT_XXL
> > +#ifdef CONFIG_X86_64
> > +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
> > +		pv_ops.cpu.load_gs_index = native_lkgs;
>
> For this to work correctly when running as a Xen PV guest, you need to add
>
> setup_clear_cpu_cap(X86_FEATURE_LKGS);
>
> to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as otherwise the Xen
> specific .load_gs_index vector will be overwritten.

Yeah, we definitely should add it to disable LKGS in a Xen PV guest.

So does it mean that the Xen PV uses a black list during feature detection?
If yes then new features are often required to be masked with an explicit
call to setup_clear_cpu_cap.

Wouldn't a white list be better?
Then the job is more just on the Xen PV side, and it can selectively enable
a new feature, sometimes with Xen PV specific handling code added.

Xin

>
>
> Juergen

2022-10-19 18:30:55

by H. Peter Anvin

Subject: RE: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()

On October 19, 2022 10:45:07 AM PDT, "Li, Xin3" <[email protected]> wrote:
>> > +static inline void __init lkgs_init(void)
>> > +{
>> > +#ifdef CONFIG_PARAVIRT_XXL
>> > +#ifdef CONFIG_X86_64
>> > +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
>> > +		pv_ops.cpu.load_gs_index = native_lkgs;
>>
>> For this to work correctly when running as a Xen PV guest, you need to add
>>
>> setup_clear_cpu_cap(X86_FEATURE_LKGS);
>>
>> to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as otherwise the Xen
>> specific .load_gs_index vector will be overwritten.
>
>Yeah, we definitely should add it to disable LKGS in a Xen PV guest.
>
>So does it mean that the Xen PV uses a black list during feature detection?
>If yes then new features are often required to be masked with an explicit
>call to setup_clear_cpu_cap.
>
>Wouldn't a white list be better?
>Then the job is more just on the Xen PV side, and it can selectively enable
>a new feature, sometimes with Xen PV specific handling code added.
>
>Xin
>
>>
>>
>> Juergen
>

Most things don't frob the paravirt list.

Maybe we should make the paravirt frobbing a separate patch, as it is separable.

2022-10-20 05:00:37

by Juergen Gross

Subject: Re: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()

On 19.10.22 20:01, H. Peter Anvin wrote:
> On October 19, 2022 10:45:07 AM PDT, "Li, Xin3" <[email protected]> wrote:
>>>> +static inline void __init lkgs_init(void)
>>>> +{
>>>> +#ifdef CONFIG_PARAVIRT_XXL
>>>> +#ifdef CONFIG_X86_64
>>>> +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
>>>> +		pv_ops.cpu.load_gs_index = native_lkgs;
>>>
>>> For this to work correctly when running as a Xen PV guest, you need to add
>>>
>>> setup_clear_cpu_cap(X86_FEATURE_LKGS);
>>>
>>> to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as otherwise the Xen
>>> specific .load_gs_index vector will be overwritten.
>>
>> Yeah, we definitely should add it to disable LKGS in a Xen PV guest.
>>
>> So does it mean that the Xen PV uses a black list during feature detection?
>> If yes then new features are often required to be masked with an explicit
>> call to setup_clear_cpu_cap.
>>
>> Wouldn't a white list be better?
>> Then the job is more just on the Xen PV side, and it can selectively enable
>> a new feature, sometimes with Xen PV specific handling code added.
>>
>> Xin
>>
>>>
>>>
>>> Juergen
>>
>
> Most things don't frob the paravirt list.
>
> Maybe we should make the paravirt frobbing a separate patch, as it is separable.

Works for me.


Juergen



2022-10-20 05:00:37

by Juergen Gross

Subject: Re: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()

On 19.10.22 19:45, Li, Xin3 wrote:
>>> +static inline void __init lkgs_init(void)
>>> +{
>>> +#ifdef CONFIG_PARAVIRT_XXL
>>> +#ifdef CONFIG_X86_64
>>> +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
>>> +		pv_ops.cpu.load_gs_index = native_lkgs;
>>
>> For this to work correctly when running as a Xen PV guest, you need to add
>>
>> setup_clear_cpu_cap(X86_FEATURE_LKGS);
>>
>> to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as otherwise the Xen
>> specific .load_gs_index vector will be overwritten.
>
> Yeah, we definitely should add it to disable LKGS in a Xen PV guest.
>
> So does it mean that the Xen PV uses a black list during feature detection?
> If yes then new features are often required to be masked with an explicit
> call to setup_clear_cpu_cap.
>
> Wouldn't a white list be better?
> Then the job is more just on the Xen PV side, and it can selectively enable
> a new feature, sometimes with Xen PV specific handling code added.

This is not how it works. Feature detection is generic code, so we'd need to
tweak that for switching to a whitelist.

Additionally most features don't require any Xen PV specific handling. This is
needed for some paravirtualized privileged operations only. So switching to a
whitelist would add more effort.
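
The trade-off between the two list styles can be sketched with plain bitmasks. This is a toy model: feature_bit(), apply_denylist() and apply_allowlist() are hypothetical names, and the bit numbers do not correspond to any real CPUID leaf or X86_FEATURE_* value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map a small feature number to a mask bit. */
static uint32_t feature_bit(unsigned int n)
{
	assert(n < 32);
	return UINT32_C(1) << n;
}

/*
 * Deny-list style (the current scheme): generic detection runs first,
 * then the guest clears the few features it cannot use.
 */
static uint32_t apply_denylist(uint32_t detected, uint32_t deny)
{
	return detected & ~deny;
}

/*
 * Allow-list style (the proposed alternative): only features that were
 * explicitly vetted for the guest stay enabled.
 */
static uint32_t apply_allowlist(uint32_t detected, uint32_t allow)
{
	return detected & allow;
}
```

When a new feature bit appears in the detected mask, the deny list lets it through unless someone adds an entry, and the allow list blocks it until someone adds an entry. Which list needs fewer updates depends on how many new features need guest-specific handling.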


Juergen



2022-10-20 06:12:37

by Li, Xin3

Subject: RE: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()

> On 19.10.22 19:45, Li, Xin3 wrote:
> >>> +static inline void __init lkgs_init(void)
> >>> +{
> >>> +#ifdef CONFIG_PARAVIRT_XXL
> >>> +#ifdef CONFIG_X86_64
> >>> +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
> >>> +		pv_ops.cpu.load_gs_index = native_lkgs;
> >>
> >> For this to work correctly when running as a Xen PV guest, you need
> >> to add
> >>
> >> setup_clear_cpu_cap(X86_FEATURE_LKGS);
> >>
> >> to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as
> >> otherwise the Xen specific .load_gs_index vector will be overwritten.
> >
> > Yeah, we definitely should add it to disable LKGS in a Xen PV guest.
> >
> > So does it mean that the Xen PV uses a black list during feature detection?
> > If yes then new features are often required to be masked with an
> > explicit call to setup_clear_cpu_cap.
> >
> > Wouldn't a white list be better?
> > Then the job is more just on the Xen PV side, and it can selectively
> > enable a new feature, sometimes with Xen PV specific handling code added.
>
> This is not how it works. Feature detection is generic code, so we'd need to
> tweak that for switching to a whitelist.
>

Yes, a Xen PV guest is basically a Linux system. However IIRC, the Xen PV
CPUID is para-virtualized, so it's Xen hypervisor's responsibility to decide
features exposed to a Xen PV guest. No?

> Additionally most features don't require any Xen PV specific handling. This is
> needed for some paravirtualized privileged operations only. So switching to a
> whitelist would add more effort.
>

LKGS is allowed only in ring 0, thus only Xen hypervisor could use it.

Xin

>
> Juergen

2022-10-20 06:44:03

by Juergen Gross

Subject: Re: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()

On 20.10.22 07:58, Li, Xin3 wrote:
>> On 19.10.22 19:45, Li, Xin3 wrote:
>>>>> +static inline void __init lkgs_init(void)
>>>>> +{
>>>>> +#ifdef CONFIG_PARAVIRT_XXL
>>>>> +#ifdef CONFIG_X86_64
>>>>> +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
>>>>> +		pv_ops.cpu.load_gs_index = native_lkgs;
>>>>
>>>> For this to work correctly when running as a Xen PV guest, you need
>>>> to add
>>>>
>>>> setup_clear_cpu_cap(X86_FEATURE_LKGS);
>>>>
>>>> to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as
>>>> otherwise the Xen specific .load_gs_index vector will be overwritten.
>>>
>>> Yeah, we definitely should add it to disable LKGS in a Xen PV guest.
>>>
>>> So does it mean that the Xen PV uses a black list during feature detection?
>>> If yes then new features are often required to be masked with an
>>> explicit call to setup_clear_cpu_cap.
>>>
>>> Wouldn't a white list be better?
>>> Then the job is more just on the Xen PV side, and it can selectively
>>> enable a new feature, sometimes with Xen PV specific handling code added.
>>
>> This is not how it works. Feature detection is generic code, so we'd need to
>> tweak that for switching to a whitelist.
>>
>
> Yes, a Xen PV guest is basically a Linux system. However IIRC, the Xen PV
> CPUID is para-virtualized, so it's Xen hypervisor's responsibility to decide
> features exposed to a Xen PV guest. No?

In theory you are right, of course.

OTOH the Xen PV interface has a long and complicated history, and we have to
deal with old hypervisor versions, too.

>> Additionally most features don't require any Xen PV specific handling. This is
>> needed for some paravirtualized privileged operations only. So switching to a
>> whitelist would add more effort.
>>
>
> LKGS is allowed only in ring 0, thus only Xen hypervisor could use it.

Right, it would be one of the features where a whitelist would be nice.

OTOH today only 11 features need special handling in Xen PV guests, while
the rest of more than 300 features doesn't.


Juergen



2022-10-20 06:44:26

by Li, Xin3

Subject: RE: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()


> >
> > Most things don't frob the paravirt list.
> >
> > Maybe we should make the paravirt frobbing a separate patch, as it is
> > separable.
>
> Works for me.

Thanks, I will send out the patch after Xen PV testing (I need to set it up first).

Xin

>
>
> Juergen

2022-10-20 06:59:08

by Li, Xin3

Subject: RE: [PATCH v4 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()

> On 20.10.22 07:58, Li, Xin3 wrote:
> >> On 19.10.22 19:45, Li, Xin3 wrote:
> >>>>> +static inline void __init lkgs_init(void)
> >>>>> +{
> >>>>> +#ifdef CONFIG_PARAVIRT_XXL
> >>>>> +#ifdef CONFIG_X86_64
> >>>>> +	if (cpu_feature_enabled(X86_FEATURE_LKGS))
> >>>>> +		pv_ops.cpu.load_gs_index = native_lkgs;
> >>>>
> >>>> For this to work correctly when running as a Xen PV guest, you need
> >>>> to add
> >>>>
> >>>> setup_clear_cpu_cap(X86_FEATURE_LKGS);
> >>>>
> >>>> to xen_init_capabilities() in arch/x86/xen/enlighten_pv.c, as
> >>>> otherwise the Xen specific .load_gs_index vector will be overwritten.
> >>>
> >>> Yeah, we definitely should add it to disable LKGS in a Xen PV guest.
> >>>
> >>> So does it mean that the Xen PV uses a black list during feature detection?
> >>> If yes then new features are often required to be masked with an
> >>> explicit call to setup_clear_cpu_cap.
> >>>
> >>> Wouldn't a white list be better?
> >>> Then the job is more just on the Xen PV side, and it can selectively
> >>> enable a new feature, sometimes with Xen PV specific handling code
> >>> added.
> >>
> >> This is not how it works. Feature detection is generic code, so we'd
> >> need to tweak that for switching to a whitelist.
> >>
> >
> > Yes, a Xen PV guest is basically a Linux system. However IIRC, the
> > Xen PV CPUID is para-virtualized, so it's Xen hypervisor's
> > responsibility to decide features exposed to a Xen PV guest. No?
>
> In theory you are right, of course.
>
> OTOH the Xen PV interface has a long and complicated history, and we have to
> deal with old hypervisor versions, too.
>
> >> Additionally most features don't require any Xen PV specific
> >> handling. This is needed for some paravirtualized privileged
> >> operations only. So switching to a whitelist would add more effort.
> >>
> >
> > LKGS is allowed only in ring 0, thus only Xen hypervisor could use it.
>
> Right, it would be one of the features where a whitelist would be nice.
>
> OTOH today only 11 features need special handling in Xen PV guests, while the
> rest of more than 300 features doesn't.
>

Got to say, nothing is more convincing than strong data.

Xin

>
> Juergen