2023-01-10 06:04:07

by Chen, Yian

[permalink] [raw]
Subject: [PATCH 0/7] Enable LASS (Linear Address space Separation)

LASS (Linear Address Space Separation) is a security
extension that prevents speculative address accesses across
user/kernel mode. The LASS details have been published in
Chapter 11 in
https://cdrdv2.intel.com/v1/dl/getContent/671368

LASS works in 64-bit mode only and partitions the 64-bit
virtual address space into two halves:
1. Lower half (LA[63]=0) --> user space
2. Upper half (LA[63]=1) --> kernel space
When LASS is enabled, a general protection #GP(0) fault will
be generated if software accesses the address from the half in
which it resides to another half, e.g., either from user space
to upper half, or from kernel space to lower half. This
protection applies to data access, code execution, cache line
flushing instructions.

Almost all kernel accesses are to the upper half of the virtual
address space. However, there are valid reasons for kernel to
access the lower half. For these cases, kernel can temporarily
suspend the enforcement of LASS by disabling SMAP (Supervisor
Mode Access Prevention).

Kernel access to copy data to/from user addresses already
disables SMAP using the stac()/clac() functions. New functions
low_addr_access_begin()/low_addr_access_end() are added to
also disable/enable SMAP around other code that legitimately
needs to access the lower half of the virtual address space.

User space cannot use any kernel address while LASS is
enabled. Less fortunately, legacy vsyscall functions used
by old version of glibc are located in the address range
0xffffffffff600000-0xffffffffff601000 and emulated in kernel.
Therefore, to comply with LASS policy, the legacy vsyscall is
disabled by default. I am looking for input from Andy and
others if this approach is acceptable.

This patch set by default enforces LASS when the platform
supports it. It can be disabled via the command line parameter
"clearcpuid" or by setting "vsyscall=emulate/xonly".

As of now there is no publicly available CPU supporting LASS.
The first one to support LASS is Sierra Forest line. The Intel
Simics® Simulator was used as software development and testing
vehicle for this patch set.

Paul Lai (1):
x86/kvm: Expose LASS feature to VM guest

Yian Chen (6):
x86/cpu: Enumerate LASS CPUID and CR4 bits
x86: Add CONFIG option X86_LASS
x86/cpu: Disable kernel LASS when patching kernel alternatives
x86/vsyscall: Setup vsyscall to compromise LASS protection
x86/cpu: Enable LASS (Linear Address Space Separation)
x86/cpu: Set LASS as pinning sensitive CR4 bit

.../admin-guide/kernel-parameters.txt | 12 +++++++----
arch/x86/Kconfig | 10 +++++++++
arch/x86/entry/vsyscall/vsyscall_64.c | 14 +++++++++++++
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/disabled-features.h | 8 ++++++-
arch/x86/include/asm/kvm_host.h | 3 ++-
arch/x86/include/asm/smap.h | 13 ++++++++++++
arch/x86/include/uapi/asm/processor-flags.h | 2 ++
arch/x86/kernel/Makefile | 2 ++
arch/x86/kernel/alternative.c | 21 +++++++++++++++++--
arch/x86/kernel/cpu/common.c | 20 +++++++++++++++++-
arch/x86/kvm/cpuid.c | 2 +-
tools/arch/x86/include/asm/cpufeatures.h | 1 +
.../arch/x86/include/asm/disabled-features.h | 8 ++++++-
tools/objtool/arch/x86/special.c | 2 ++
15 files changed, 108 insertions(+), 11 deletions(-)

--
2.34.1


2023-01-10 06:04:40

by Chen, Yian

[permalink] [raw]
Subject: [PATCH 7/7] x86/kvm: Expose LASS feature to VM guest

From: Paul Lai <[email protected]>

Expose LASS feature which is defined in the CPUID.7.1.EAX
bit and enabled via the CR4 bit for VM guest.

Signed-off-by: Paul Lai <[email protected]>
Signed-off-by: Yian Chen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 3 ++-
arch/x86/kvm/cpuid.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f35f1ff4427b..bd39f45e9b5a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -125,7 +125,8 @@
| X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
| X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
| X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
- | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+ | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+ | X86_CR4_LASS))

#define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b14653b61470..e0f53f85f5ae 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -664,7 +664,7 @@ void kvm_set_cpu_caps(void)

kvm_cpu_cap_mask(CPUID_7_1_EAX,
F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) | F(AMX_FP16) |
- F(AVX_IFMA)
+ F(AVX_IFMA) | F(LASS)
);

kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
--
2.34.1

2023-01-10 06:18:36

by Chen, Yian

[permalink] [raw]
Subject: [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection

Kernel enables LASS automatically at starting time in LASS
capable platforms. Any access to kernel addresses
or upper half addresses from user space triggers a #GP
fault.

Legacy vsyscall does not comply with LASS, because
the vsyscall functions are mapped in the range
0xffffffffff600000-0xffffffffff601000.

In theory, it would be possible to write a #GP fault handler
to emulate the old vsyscall behavior, but vsyscall has been
deprecated for some time, so this has not been done.

Therefore, when kernel enforces LASS, vsyscall does not work
and should be disabled. On the other hand, the user can relax
the enforcement by clearing lass cpu id (clearcpuid=lass/390)
or enabling vsyscall (vsyscall=xxx) from kernel command line.
The user can also opt-out LASS in config file to build kernel
binary.

Signed-off-by: Yian Chen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 12 ++++++++----
arch/x86/entry/vsyscall/vsyscall_64.c | 14 ++++++++++++++
2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6cfa6e3996cf..3988e0c8c175 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6755,10 +6755,14 @@
versions of glibc use these calls. Because these
functions are at fixed addresses, they make nice
targets for exploits that can control RIP.
-
- emulate [default] Vsyscalls turn into traps and are
- emulated reasonably safely. The vsyscall
- page is readable.
+ In newer versions of Intel platforms that come with
+ LASS(Linear Address Space separation) protection,
+ vsyscall is disabled by default. Enabling vsyscall
+ via the parameter overrides LASS protection.
+
+ emulate [default if not LASS capable] Vsyscalls
+ turn into traps and are emulated reasonably
+ safely. The vsyscall page is readable.

xonly Vsyscalls turn into traps and are
emulated reasonably safely. The vsyscall
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index 4af81df133ee..2691f26835d1 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -63,6 +63,12 @@ static int __init vsyscall_setup(char *str)
else
return -EINVAL;

+ if (cpu_feature_enabled(X86_FEATURE_LASS) &&
+ vsyscall_mode != NONE) {
+ setup_clear_cpu_cap(X86_FEATURE_LASS);
+ pr_warn("LASS disabled by command line enabling vsyscall\n");
+ }
+
return 0;
}

@@ -379,6 +385,14 @@ void __init map_vsyscall(void)
extern char __vsyscall_page;
unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page);

+ /*
+ * When LASS is on, vsyscall triggers a #GP fault,
+ * so that force vsyscall_mode to NONE.
+ */
+ if (cpu_feature_enabled(X86_FEATURE_LASS)) {
+ vsyscall_mode = NONE;
+ return;
+ }
/*
* For full emulation, the page needs to exist for real. In
* execute-only mode, there is no PTE at all backing the vsyscall
--
2.34.1

2023-01-10 06:29:30

by Chen, Yian

[permalink] [raw]
Subject: [PATCH 6/7] x86/cpu: Set LASS as pinning sensitive CR4 bit

Security protection features are pinning sensitive.
LASS comes with an effort for security concerns.
Therefore, add it to the set of pinning sensitive
bits

Signed-off-by: Yian Chen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
---
arch/x86/kernel/cpu/common.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index efc7c7623968..e224cbaf7866 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -432,7 +432,7 @@ static __always_inline void setup_lass(struct cpuinfo_x86 *c)
/* These bits should not change their value after CPU init is finished. */
static const unsigned long cr4_pinned_mask =
X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
- X86_CR4_FSGSBASE | X86_CR4_CET;
+ X86_CR4_FSGSBASE | X86_CR4_CET | X86_CR4_LASS;
static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
static unsigned long cr4_pinned_bits __ro_after_init;

--
2.34.1

2023-01-10 19:57:38

by Sohil Mehta

[permalink] [raw]
Subject: Re: [PATCH 0/7] Enable LASS (Linear Address space Separation)

Yian,

I added a few missing lists pertaining to KVM, MM and documentation
since these patches impact them.

In future, scripts/get_maintainer.pl can help you with generating the
relevant maintainers and lists.

On 1/9/2023 9:51 PM, Yian Chen wrote:
> LASS (Linear Address Space Separation) is a security
> extension that prevents speculative address accesses across
> user/kernel mode. The LASS details have been published in
> Chapter 11 in
> https://cdrdv2.intel.com/v1/dl/getContent/671368
>
> LASS works in 64-bit mode only and partitions the 64-bit
> virtual address space into two halves:
> 1. Lower half (LA[63]=0) --> user space
> 2. Upper half (LA[63]=1) --> kernel space
> When LASS is enabled, a general protection #GP(0) fault will
> be generated if software accesses the address from the half in
> which it resides to another half, e.g., either from user space
> to upper half, or from kernel space to lower half. This
> protection applies to data access, code execution, cache line
> flushing instructions.
>
> Almost all kernel accesses are to the upper half of the virtual
> address space. However, there are valid reasons for kernel to
> access the lower half. For these cases, kernel can temporarily
> suspend the enforcement of LASS by disabling SMAP (Supervisor
> Mode Access Prevention).
>
> Kernel access to copy data to/from user addresses already
> disables SMAP using the stac()/clac() functions. New functions
> low_addr_access_begin()/low_addr_access_end() are added to
> also disable/enable SMAP around other code that legitimately
> needs to access the lower half of the virtual address space.
>
> User space cannot use any kernel address while LASS is
> enabled. Less fortunately, legacy vsyscall functions used
> by old version of glibc are located in the address range
> 0xffffffffff600000-0xffffffffff601000 and emulated in kernel.
> Therefore, to comply with LASS policy, the legacy vsyscall is
> disabled by default. I am looking for input from Andy and
> others if this approach is acceptable.
>
> This patch set by default enforces LASS when the platform
> supports it. It can be disabled via the command line parameter
> "clearcpuid" or by setting "vsyscall=emulate/xonly".
>
> As of now there is no publicly available CPU supporting LASS.
> The first one to support LASS is Sierra Forest line. The Intel
> Simics® Simulator was used as software development and testing
> vehicle for this patch set.
>
> Paul Lai (1):
> x86/kvm: Expose LASS feature to VM guest
>
> Yian Chen (6):
> x86/cpu: Enumerate LASS CPUID and CR4 bits
> x86: Add CONFIG option X86_LASS
> x86/cpu: Disable kernel LASS when patching kernel alternatives
> x86/vsyscall: Setup vsyscall to compromise LASS protection
> x86/cpu: Enable LASS (Linear Address Space Separation)
> x86/cpu: Set LASS as pinning sensitive CR4 bit
>

It's usually good practice to include a base-commit to make it easier to
apply these patches.

> .../admin-guide/kernel-parameters.txt | 12 +++++++----
> arch/x86/Kconfig | 10 +++++++++
> arch/x86/entry/vsyscall/vsyscall_64.c | 14 +++++++++++++
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/disabled-features.h | 8 ++++++-
> arch/x86/include/asm/kvm_host.h | 3 ++-
> arch/x86/include/asm/smap.h | 13 ++++++++++++
> arch/x86/include/uapi/asm/processor-flags.h | 2 ++
> arch/x86/kernel/Makefile | 2 ++
> arch/x86/kernel/alternative.c | 21 +++++++++++++++++--
> arch/x86/kernel/cpu/common.c | 20 +++++++++++++++++-
> arch/x86/kvm/cpuid.c | 2 +-
> tools/arch/x86/include/asm/cpufeatures.h | 1 +
> .../arch/x86/include/asm/disabled-features.h | 8 ++++++-
> tools/objtool/arch/x86/special.c | 2 ++
> 15 files changed, 108 insertions(+), 11 deletions(-)
>

2023-01-10 23:11:49

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 0/7] Enable LASS (Linear Address space Separation)

On 1/9/23 21:51, Yian Chen wrote:
> LASS (Linear Address Space Separation) is a security
> extension that prevents speculative address accesses across
> user/kernel mode. The LASS details have been published in
> Chapter 11 in
> https://cdrdv2.intel.com/v1/dl/getContent/671368
>
> LASS works in 64-bit mode only and partitions the 64-bit
> virtual address space into two halves:
> 1. Lower half (LA[63]=0) --> user space
> 2. Upper half (LA[63]=1) --> kernel space
> When LASS is enabled, a general protection #GP(0) fault will
> be generated if software accesses the address from the half in
> which it resides to another half, e.g., either from user space
> to upper half, or from kernel space to lower half. This
> protection applies to data access, code execution, cache line
> flushing instructions.

This does a good job of explaining the nuts and bolts -- *what* LASS
does. It does a less good job of explaining why this was built, how it
can benefit end users and who cares about it.

LASS seemed really cool when we were reeling from Meltdown. It would
*obviously* have been a godsend five years ago. But, it's less clear
what role it plays today and how important it is.

Could you enlighten us, please?

2023-01-11 00:54:14

by Sohil Mehta

[permalink] [raw]
Subject: Re: [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection

On 1/9/2023 9:52 PM, Yian Chen wrote:

> The user can also opt-out LASS in config file to build kernel
> binary.

This line is unnecessary.

> Signed-off-by: Yian Chen <[email protected]>
> Reviewed-by: Tony Luck <[email protected]>
> ---
> Documentation/admin-guide/kernel-parameters.txt | 12 ++++++++----
> arch/x86/entry/vsyscall/vsyscall_64.c | 14 ++++++++++++++
> 2 files changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 6cfa6e3996cf..3988e0c8c175 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -6755,10 +6755,14 @@
> versions of glibc use these calls. Because these
> functions are at fixed addresses, they make nice
> targets for exploits that can control RIP.
> -
> - emulate [default] Vsyscalls turn into traps and are
> - emulated reasonably safely. The vsyscall
> - page is readable.

The existing documentation here is incorrect. The default vsyscall mode
is actually xonly. This has been so since:
commit 625b7b7f79c6 (x86/vsyscall: Change the default vsyscall mode to
xonly)

> + In newer versions of Intel platforms that come with

Words such as "newer" in the kernel start losing meaning very quickly.
Also, this comment looks out of place in between the vsyscall sub-options.

> + LASS(Linear Address Space separation) protection,
> + vsyscall is disabled by default. Enabling vsyscall
> + via the parameter overrides LASS protection.
> +


IIUC, you are making the default mode somewhat dynamic.

vsyscall = xonly (if LASS is not enabled)
vsyscall = none (if LASS is enabled)

The decision to disable vsyscall doesn't happen at compile time but it
is taken at runtime when the LASS feature is detected. This would make
the system behavior highly platform dependent.

Instead of doing this dance, can we provide a simplified behavior to the
user/admin and move the decision making to compile time?

Option 1: A bigger hammer
Set the default vsyscall option as CONFIG_LEGACY_VSYSCALL_NONE.
CONFIG_X86_LASS is set by default. Changing the compile time VSYSCALL
option would disable LASS.

Option 2: A smaller hammer
CONFIG_X86_LASS is off by default. Vsyscall default stays as
CONFIG_LEGACY_VSYSCALL_XONLY. Selecting LASS automatically chooses
CONFIG_LEGACY_VSYSCALL_NONE. In this case, even if the hardware doesn't
support LASS, vsyscall would still remain disabled.

In both of these cases, any command line input to override the vsyscall
behavior can disable LASS as well.


> + emulate [default if not LASS capable] Vsyscalls
> + turn into traps and are emulated reasonably
> + safely. The vsyscall page is readable.
>
> xonly Vsyscalls turn into traps and are
> emulated reasonably safely. The vsyscall
> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
> index 4af81df133ee..2691f26835d1 100644
> --- a/arch/x86/entry/vsyscall/vsyscall_64.c
> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c
> @@ -63,6 +63,12 @@ static int __init vsyscall_setup(char *str)
> else
> return -EINVAL;
>
> + if (cpu_feature_enabled(X86_FEATURE_LASS) &&
> + vsyscall_mode != NONE) {
> + setup_clear_cpu_cap(X86_FEATURE_LASS);
> + pr_warn("LASS disabled by command line enabling vsyscall\n");

A warning seems too drastic here. A pr_info() should probably suffice.

> + }
> +
> return 0;
> }
>
> @@ -379,6 +385,14 @@ void __init map_vsyscall(void)
> extern char __vsyscall_page;
> unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page);
>
> + /*
> + * When LASS is on, vsyscall triggers a #GP fault,
> + * so that force vsyscall_mode to NONE.
> + */

This comment doesn't make much sense nor does it provide the full
picture. Some of the reasoning from the cover letter/commit log can be
duplicated here.

> + if (cpu_feature_enabled(X86_FEATURE_LASS)) {
> + vsyscall_mode = NONE;
> + return;
> + }
> /*
> * For full emulation, the page needs to exist for real. In
> * execute-only mode, there is no PTE at all backing the vsyscall

2023-01-12 02:30:12

by Chen, Yian

[permalink] [raw]
Subject: Re: [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection



On 1/10/2023 4:34 PM, Sohil Mehta wrote:
> On 1/9/2023 9:52 PM, Yian Chen wrote:
>
>> The user can also opt-out LASS in config file to build kernel
>> binary.
>
> This line is unnecessary.
>
Sure, this line can be dropped.

>> Signed-off-by: Yian Chen <[email protected]>
>> Reviewed-by: Tony Luck <[email protected]>
>> ---
>>   Documentation/admin-guide/kernel-parameters.txt | 12 ++++++++----
>>   arch/x86/entry/vsyscall/vsyscall_64.c           | 14 ++++++++++++++
>>   2 files changed, 22 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt
>> b/Documentation/admin-guide/kernel-parameters.txt
>> index 6cfa6e3996cf..3988e0c8c175 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -6755,10 +6755,14 @@
>>               versions of glibc use these calls.  Because these
>>               functions are at fixed addresses, they make nice
>>               targets for exploits that can control RIP.
>> -
>> -            emulate     [default] Vsyscalls turn into traps and are
>> -                        emulated reasonably safely.  The vsyscall
>> -                    page is readable.
>
> The existing documentation here is incorrect. The default vsyscall mode
> is actually xonly. This has been so since:
> commit 625b7b7f79c6 (x86/vsyscall: Change the default vsyscall mode to
> xonly)
>
Yes, you are right. but this patch can overwrite and correct existing
one. I am assuming we don't need to correct the existing document first
before update it for LASS.

>> +            In newer versions of Intel platforms that come with
>
> Words such as "newer" in the kernel start losing meaning very quickly.
> Also, this comment looks out of place in between the vsyscall sub-options.
>
>> +            LASS(Linear Address Space separation) protection,
>> +            vsyscall is disabled by default. Enabling vsyscall
>> +            via the parameter overrides LASS protection.
>> +
Sure, I will take out this part change.
>
>
> IIUC, you are making the default mode somewhat dynamic.
>
> vsyscall = xonly (if LASS is not enabled)
> vsyscall = none (if LASS is enabled)
>
Yes, this looks better.

> The decision to disable vsyscall doesn't happen at compile time but it
> is taken at runtime when the LASS feature is detected. This would make
> the system behavior highly platform dependent.
>
> Instead of doing this dance, can we provide a simplified behavior to the
> user/admin and move the decision making to compile time?
>
Current strategy is to disable vsyscall by default only for LASS capable
platforms. So that the dynamic decision is likely a necessary.

> Option 1: A bigger hammer
> Set the default vsyscall option as CONFIG_LEGACY_VSYSCALL_NONE.
> CONFIG_X86_LASS is set by default. Changing the compile time VSYSCALL
> option would disable LASS.
>
This means to disable vsyscall by default for all platforms, doen't
matter LASS. I am not sure if we want to go that far.

> Option 2: A smaller hammer
> CONFIG_X86_LASS is off by default. Vsyscall default stays as
> CONFIG_LEGACY_VSYSCALL_XONLY. Selecting LASS automatically chooses
> CONFIG_LEGACY_VSYSCALL_NONE. In this case, even if the hardware doesn't
> support LASS, vsyscall would still remain disabled.
>
This turns out to disable LASS by default. Then the LASS may not be
taken in the first place.

> In both of these cases, any command line input to override the vsyscall
> behavior can disable LASS as well.
>
>
>> +            emulate     [default if not LASS capable] Vsyscalls
>> +                    turn into traps and are emulated reasonably
>> +                    safely.  The vsyscall page is readable.
>>               xonly       Vsyscalls turn into traps and are
>>                           emulated reasonably safely.  The vsyscall
>> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c
>> b/arch/x86/entry/vsyscall/vsyscall_64.c
>> index 4af81df133ee..2691f26835d1 100644
>> --- a/arch/x86/entry/vsyscall/vsyscall_64.c
>> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c
>> @@ -63,6 +63,12 @@ static int __init vsyscall_setup(char *str)
>>           else
>>               return -EINVAL;
>> +        if (cpu_feature_enabled(X86_FEATURE_LASS) &&
>> +            vsyscall_mode != NONE) {
>> +            setup_clear_cpu_cap(X86_FEATURE_LASS);
>> +            pr_warn("LASS disabled by command line enabling
>> vsyscall\n");
>
> A warning seems too drastic here. A pr_info() should probably suffice.
>
sure I will modify it to use pr_info.

>> +        }
>> +
>>           return 0;
>>       }
>> @@ -379,6 +385,14 @@ void __init map_vsyscall(void)
>>       extern char __vsyscall_page;
>>       unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page);
>> +    /*
>> +     * When LASS is on, vsyscall triggers a #GP fault,
>> +     * so that force vsyscall_mode to NONE.
>> +     */
>
> This comment doesn't make much sense nor does it provide the full
> picture. Some of the reasoning from the cover letter/commit log can be
> duplicated here.
>
sure, How about a more detail inline comment as following:
+ /*
+ * When LASS protection is on, user space vsyscall triggers
+ * a #GP fault since the vsyscall page is
+ * 0xffffffffff600000-0xffffffffff601000,
+ * so that force vsyscall_mode to NONE and disable this mapping.
+ */

>> +    if (cpu_feature_enabled(X86_FEATURE_LASS)) {
>> +        vsyscall_mode = NONE;
>> +        return;
>> +    }
>>       /*
>>        * For full emulation, the page needs to exist for real.  In
>>        * execute-only mode, there is no PTE at all backing the vsyscall
>
Thanks,
Yian

2023-01-12 03:02:36

by Sohil Mehta

[permalink] [raw]
Subject: Re: [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection

>> The existing documentation here is incorrect. The default vsyscall
>> mode is actually xonly. This has been so since:
>> commit 625b7b7f79c6 (x86/vsyscall: Change the default vsyscall mode to
>> xonly)
>>
> Yes, you are right. but this patch can overwrite and correct existing
> one. I am assuming we don't need to correct the existing document first
> before update it for LASS.
>

We should fix this independent of the LASS enabling. I sent a patch
earlier today to address it. I apologize, I missed cc'ing you.

https://lore.kernel.org/lkml/[email protected]/

>>> +            In newer versions of Intel platforms that come with
>>
>> Words such as "newer" in the kernel start losing meaning very quickly.
>> Also, this comment looks out of place in between the vsyscall
>> sub-options.
>>
>>> +            LASS(Linear Address Space separation) protection,
>>> +            vsyscall is disabled by default. Enabling vsyscall
>>> +            via the parameter overrides LASS protection.
>>> +
> Sure, I will take out this part change.

Actually, having some text here might be ok. I mistook it to be placed
between the sub-options. But avoid merging it with the previous
paragraph as is the case right now.

>> Instead of doing this dance, can we provide a simplified behavior to
>> the user/admin and move the decision making to compile time?
>>
> Current strategy is to disable vsyscall by default only for LASS capable
> platforms. So that the dynamic decision is likely a necessary.
>

Making this dynamic and platform dependent would make things hard to
debug and isolate. It would be a perfect recipe for "But, it works on my
system!" type of issues.

Let's see what others have to say.

-Sohil

2023-01-21 04:28:35

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection



On Mon, Jan 9, 2023, at 9:52 PM, Yian Chen wrote:
> Kernel enables LASS automatically at starting time in LASS
> capable platforms. Any access to kernel addresses
> or upper half addresses from user space triggers a #GP
> fault.
>
> Legacy vsyscall does not comply with LASS, because
> the vsyscall functions are mapped in the range
> 0xffffffffff600000-0xffffffffff601000.
>
> In theory, it would be possible to write a #GP fault handler
> to emulate the old vsyscall behavior, but vsyscall has been
> deprecated for some time, so this has not been done.

The ISE docs are really quite explicit that #GP will have RIP pointing at the vDSO if LASS is on. Unless I’ve missed something, this should be quite straightforward to handle without breaking userspace compatibility. Let’s please do this.

>
> Therefore, when kernel enforces LASS, vsyscall does not work
> and should be disabled. On the other hand, the user can relax
> the enforcement by clearing lass cpu id (clearcpuid=lass/390)
> or enabling vsyscall (vsyscall=xxx) from kernel command line.
> The user can also opt-out LASS in config file to build kernel
> binary.
>
> Signed-off-by: Yian Chen <[email protected]>
> Reviewed-by: Tony Luck <[email protected]>
> ---
> Documentation/admin-guide/kernel-parameters.txt | 12 ++++++++----
> arch/x86/entry/vsyscall/vsyscall_64.c | 14 ++++++++++++++
> 2 files changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt
> b/Documentation/admin-guide/kernel-parameters.txt
> index 6cfa6e3996cf..3988e0c8c175 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -6755,10 +6755,14 @@
> versions of glibc use these calls. Because these
> functions are at fixed addresses, they make nice
> targets for exploits that can control RIP.
> -
> - emulate [default] Vsyscalls turn into traps and are
> - emulated reasonably safely. The vsyscall
> - page is readable.
> + In newer versions of Intel platforms that come with
> + LASS(Linear Address Space separation) protection,
> + vsyscall is disabled by default. Enabling vsyscall
> + via the parameter overrides LASS protection.
> +
> + emulate [default if not LASS capable] Vsyscalls
> + turn into traps and are emulated reasonably
> + safely. The vsyscall page is readable.
>
> xonly Vsyscalls turn into traps and are
> emulated reasonably safely. The vsyscall
> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c
> b/arch/x86/entry/vsyscall/vsyscall_64.c
> index 4af81df133ee..2691f26835d1 100644
> --- a/arch/x86/entry/vsyscall/vsyscall_64.c
> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c
> @@ -63,6 +63,12 @@ static int __init vsyscall_setup(char *str)
> else
> return -EINVAL;
>
> + if (cpu_feature_enabled(X86_FEATURE_LASS) &&
> + vsyscall_mode != NONE) {
> + setup_clear_cpu_cap(X86_FEATURE_LASS);
> + pr_warn("LASS disabled by command line enabling vsyscall\n");
> + }
> +
> return 0;
> }
>
> @@ -379,6 +385,14 @@ void __init map_vsyscall(void)
> extern char __vsyscall_page;
> unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page);
>
> + /*
> + * When LASS is on, vsyscall triggers a #GP fault,
> + * so that force vsyscall_mode to NONE.
> + */
> + if (cpu_feature_enabled(X86_FEATURE_LASS)) {
> + vsyscall_mode = NONE;
> + return;
> + }
> /*
> * For full emulation, the page needs to exist for real. In
> * execute-only mode, there is no PTE at all backing the vsyscall
> --
> 2.34.1

2023-02-07 03:21:29

by Wang, Lei

[permalink] [raw]
Subject: Re: [PATCH 7/7] x86/kvm: Expose LASS feature to VM guest

Could you also CC the KVM mailing list ([email protected]) for KVM guys to review?

BR,
Lei

On 1/10/2023 1:52 PM, Yian Chen wrote:
> From: Paul Lai <[email protected]>
>
> Expose LASS feature which is defined in the CPUID.7.1.EAX
> bit and enabled via the CR4 bit for VM guest.
>
> Signed-off-by: Paul Lai <[email protected]>
> Signed-off-by: Yian Chen <[email protected]>
> Reviewed-by: Tony Luck <[email protected]>
> ---
> arch/x86/include/asm/kvm_host.h | 3 ++-
> arch/x86/kvm/cpuid.c | 2 +-
> 2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index f35f1ff4427b..bd39f45e9b5a 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -125,7 +125,8 @@
> | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
> | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
> | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
> - | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
> + | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
> + | X86_CR4_LASS))
>
> #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index b14653b61470..e0f53f85f5ae 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -664,7 +664,7 @@ void kvm_set_cpu_caps(void)
>
> kvm_cpu_cap_mask(CPUID_7_1_EAX,
> F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) | F(AMX_FP16) |
> - F(AVX_IFMA)
> + F(AVX_IFMA) | F(LASS)
> );
>
> kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,

2023-02-09 17:18:22

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 7/7] x86/kvm: Expose LASS feature to VM guest

On Tue, Feb 07, 2023, Wang, Lei wrote:
> Could you also CC the KVM mailing list ([email protected]) for KVM guys to review?

As Sohil pointed out[*], use scripts/get_maintainer.pl. Cc'ing kvm@ on random
bits of the series isn't sufficient, e.g. I completely missed Sohil's Cc on the
cover letter. For me personally at least, if I'm Cc'd on one patch then I want
to be Cc'd on all patches.

That's somewhat of a moot point for the future of LASS enabling, because the KVM
enabling needs to be a separate series.

[*] https://lore.kernel.org/all/[email protected]

> On 1/10/2023 1:52 PM, Yian Chen wrote:
> > From: Paul Lai <[email protected]>
> >
> > Expose LASS feature which is defined in the CPUID.7.1.EAX
> > bit and enabled via the CR4 bit for VM guest.
> >
> > Signed-off-by: Paul Lai <[email protected]>
> > Signed-off-by: Yian Chen <[email protected]>
> > Reviewed-by: Tony Luck <[email protected]>

So I'm pretty sure this "review" was to satisfy Intel's requirements to never post
anything to x86@ that wasn't reviewed internally by one of a set of select
individuals. Please drop that requirement for KVM x86, I truly think it's doing
more harm than good.

This is laughably inadqueate "enabling". This has architecturally visible effects
on _all_ guest virtual/linear accesses. That statement alone should set off alarms
left and right that simply advertising support via KVM_GET_SUPPORTED_CPUID and
marking the CR4 bit as not unconditionally reserved is insufficient.

My recommendation is to look at the touchpoints for X86_CR4_LA57 and follow the
various breadcrumbs to identify what all needs to be done to properly support LASS.

More importantly, I want tests. There's _zero_ chance this was reasonably tested.
E.g. the most basic negative testcase of attempting to set X86_CR4_LASS on a host
without LASS is missed (see __cr4_reserved_bits()).

I will not so much as glance at future versions of LASS enabling if there aren't
testscases in KVM-unit-tests and/or selftests that give me a high level of confidence
that KVM correctly handles the various edge cases, e.g. that KVM correctly handles
an implicit supervisor access from user mode while in the emulator.

That goes a for all new hardware enabling: no tests, no review. Please relay that
message to managers, leadership, and everyone working on KVM. Ample warning has
been given that new KVM features need to be accompanied by tests.

> > ---
> > arch/x86/include/asm/kvm_host.h | 3 ++-
> > arch/x86/kvm/cpuid.c | 2 +-
> > 2 files changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index f35f1ff4427b..bd39f45e9b5a 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -125,7 +125,8 @@
> > | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
> > | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
> > | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
> > - | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
> > + | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
> > + | X86_CR4_LASS))
> >
> > #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
> >
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index b14653b61470..e0f53f85f5ae 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -664,7 +664,7 @@ void kvm_set_cpu_caps(void)
> >
> > kvm_cpu_cap_mask(CPUID_7_1_EAX,
> > F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) | F(AMX_FP16) |
> > - F(AVX_IFMA)
> > + F(AVX_IFMA) | F(LASS)
> > );
> >
> > kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,