2015-08-13 00:55:52

by Kees Cook

Subject: [PATCH] x86, vsyscall: add CONFIG to control default

Most modern systems can run with vsyscall=none. In an effort to provide
a way for build-time defaults to lack legacy settings, this adds a new
CONFIG to select the type of vsyscall mapping to use, similar to the
existing "vsyscall" command line parameter.

Signed-off-by: Kees Cook <[email protected]>
---
arch/x86/Kconfig | 49 +++++++++++++++++++++++++++++++++++
arch/x86/entry/vsyscall/vsyscall_64.c | 9 ++++++-
2 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b3a1a5d77d92..fbd0fad714a1 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2010,6 +2010,55 @@ config COMPAT_VDSO
If unsure, say N: if you are compiling your own kernel, you
are unlikely to be using a buggy version of glibc.

+choice
+ prompt "vsyscall table for legacy applications"
+ depends on X86_64
+ default LEGACY_VSYSCALL_EMULATE
+ help
+ Legacy user code that does not know how to find the vDSO expects
+ to be able to issue three syscalls by calling fixed addresses in
+ kernel space. Since this location is not randomized with ASLR,
+ it can be used to assist security vulnerability exploitation.
+
+ This setting can be changed at boot time via the kernel command
+ line parameter vsyscall=[native|emulate|none].
+
+ On a system with recent enough glibc (2.14 or newer) and no
+ static binaries, you can say None without a performance penalty
+ to improve security.
+
+ If unsure, select "Emulate".
+
+ config LEGACY_VSYSCALL_NATIVE
+ bool "Native"
+ help
+ Actual executable code is located in the fixed vsyscall
+ address mapping, implementing time() efficiently. Since
+ this makes the mapping executable, it can be used during
+ security vulnerability exploitation (traditionally as
+ ROP gadgets). This configuration is not recommended.
+
+ config LEGACY_VSYSCALL_EMULATE
+ bool "Emulate"
+ help
+ The kernel traps and emulates calls into the fixed
+ vsyscall address mapping. This makes the mapping
+ non-executable, but it still contains known contents,
+ which could be used in certain rare security vulnerability
+ exploits. This configuration is recommended when userspace
+ still uses the vsyscall area.
+
+ config LEGACY_VSYSCALL_NONE
+ bool "None"
+ help
+ There will be no vsyscall mapping at all. This will
+ eliminate any risk of ASLR bypass due to the vsyscall
+ fixed address mapping. Attempts to use the vsyscalls
+ will be reported to dmesg, so that either old or
+ malicious userspace programs can be identified.
+
+endchoice
+
config CMDLINE_BOOL
bool "Built-in kernel command line"
---help---
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index 2dcc6ff6fdcc..47e2904b043b 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -38,7 +38,14 @@
#define CREATE_TRACE_POINTS
#include "vsyscall_trace.h"

-static enum { EMULATE, NATIVE, NONE } vsyscall_mode = EMULATE;
+static enum { EMULATE, NATIVE, NONE } vsyscall_mode =
+#ifdef CONFIG_LEGACY_VSYSCALL_NATIVE
+ NATIVE;
+#elif defined(CONFIG_LEGACY_VSYSCALL_NONE)
+ NONE;
+#else
+ EMULATE;
+#endif

static int __init vsyscall_setup(char *str)
{
--
1.9.1


--
Kees Cook
Chrome OS Security
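
The way the build-time default interacts with the existing "vsyscall" command line parameter can be sketched in plain userspace C. This is only an illustration under stated assumptions: parse_vsyscall_param() and enum vsyscall_kind are hypothetical names standing in for the kernel's boot-parameter handler and its anonymous enum, and the CONFIG_* macros are assumed to come from the build system. With neither macro defined, the default falls through to EMULATE, as in the patch.

```c
#include <string.h>

/* Same three modes as the enum in the patch. */
enum vsyscall_kind { EMULATE, NATIVE, NONE };

/* Build-time default: picked by whichever CONFIG_* macro is defined. */
static enum vsyscall_kind vsyscall_mode =
#if defined(CONFIG_LEGACY_VSYSCALL_NATIVE)
	NATIVE;
#elif defined(CONFIG_LEGACY_VSYSCALL_NONE)
	NONE;
#else
	EMULATE;
#endif

/*
 * Hypothetical stand-in for the boot-time "vsyscall=" handler.
 * A recognized value overrides the build-time default and returns 0;
 * anything else leaves the mode untouched and returns -1.
 */
static int parse_vsyscall_param(const char *str)
{
	if (!str)
		return -1;
	if (!strcmp(str, "emulate"))
		vsyscall_mode = EMULATE;
	else if (!strcmp(str, "native"))
		vsyscall_mode = NATIVE;
	else if (!strcmp(str, "none"))
		vsyscall_mode = NONE;
	else
		return -1;
	return 0;
}
```

The point of the sketch is the ordering: the CONFIG choice only sets the initial value, so a vsyscall= argument on the command line still wins at boot.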


2015-08-13 02:23:46

by Josh Triplett

Subject: Re: [PATCH] x86, vsyscall: add CONFIG to control default

On Wed, Aug 12, 2015 at 05:55:19PM -0700, Kees Cook wrote:
> Most modern systems can run with vsyscall=none. In an effort to provide
> a way for build-time defaults to lack legacy settings, this adds a new
> CONFIG to select the type of vsyscall mapping to use, similar to the
> existing "vsyscall" command line parameter.
>
> Signed-off-by: Kees Cook <[email protected]>

Seems reasonable to me. One question, though: is there *any* reason to
choose "native" over "emulate"? (Does "emulate" have a sufficient
performance penalty to matter, and do people running old glibc really
care about that performance while still not wanting to upgrade?)
If there is a reason, could you please document it in the
descriptions of the "native" and "emulate" options (as an upside and a
downside, respectively)? If there isn't, you might consider a patch to
remove "native".

> arch/x86/Kconfig | 49 +++++++++++++++++++++++++++++++++++
> arch/x86/entry/vsyscall/vsyscall_64.c | 9 ++++++-
> 2 files changed, 57 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index b3a1a5d77d92..fbd0fad714a1 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2010,6 +2010,55 @@ config COMPAT_VDSO
> If unsure, say N: if you are compiling your own kernel, you
> are unlikely to be using a buggy version of glibc.
>
> +choice
> + prompt "vsyscall table for legacy applications"
> + depends on X86_64
> + default LEGACY_VSYSCALL_EMULATE
> + help
> + Legacy user code that does not know how to find the vDSO expects
> + to be able to issue three syscalls by calling fixed addresses in
> + kernel space. Since this location is not randomized with ASLR,
> + it can be used to assist security vulnerability exploitation.
> +
> + This setting can be changed at boot time via the kernel command
> + line parameter vsyscall=[native|emulate|none].
> +
> + On a system with recent enough glibc (2.14 or newer) and no
> + static binaries, you can say None without a performance penalty
> + to improve security.
> +
> + If unsure, select "Emulate".
> +
> + config LEGACY_VSYSCALL_NATIVE
> + bool "Native"
> + help
> + Actual executable code is located in the fixed vsyscall
> + address mapping, implementing time() efficiently. Since
> + this makes the mapping executable, it can be used during
> + security vulnerability exploitation (traditionally as
> + ROP gadgets). This configuration is not recommended.
> +
> + config LEGACY_VSYSCALL_EMULATE
> + bool "Emulate"
> + help
> + The kernel traps and emulates calls into the fixed
> + vsyscall address mapping. This makes the mapping
> + non-executable, but it still contains known contents,
> + which could be used in certain rare security vulnerability
> + exploits. This configuration is recommended when userspace
> + still uses the vsyscall area.
> +
> + config LEGACY_VSYSCALL_NONE
> + bool "None"
> + help
> + There will be no vsyscall mapping at all. This will
> + eliminate any risk of ASLR bypass due to the vsyscall
> + fixed address mapping. Attempts to use the vsyscalls
> + will be reported to dmesg, so that either old or
> + malicious userspace programs can be identified.
> +
> +endchoice
> +
> config CMDLINE_BOOL
> bool "Built-in kernel command line"
> ---help---
> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
> index 2dcc6ff6fdcc..47e2904b043b 100644
> --- a/arch/x86/entry/vsyscall/vsyscall_64.c
> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c
> @@ -38,7 +38,14 @@
> #define CREATE_TRACE_POINTS
> #include "vsyscall_trace.h"
>
> -static enum { EMULATE, NATIVE, NONE } vsyscall_mode = EMULATE;
> +static enum { EMULATE, NATIVE, NONE } vsyscall_mode =
> +#ifdef CONFIG_LEGACY_VSYSCALL_NATIVE
> + NATIVE;
> +#elif defined(CONFIG_LEGACY_VSYSCALL_NONE)
> + NONE;
> +#else
> + EMULATE;
> +#endif
>
> static int __init vsyscall_setup(char *str)
> {
> --
> 1.9.1
>
>
> --
> Kees Cook
> Chrome OS Security

2015-08-31 20:13:19

by Kees Cook

Subject: Re: [PATCH] x86, vsyscall: add CONFIG to control default

On Wed, Aug 12, 2015 at 7:23 PM, Josh Triplett <[email protected]> wrote:
> On Wed, Aug 12, 2015 at 05:55:19PM -0700, Kees Cook wrote:
>> Most modern systems can run with vsyscall=none. In an effort to provide
>> a way for build-time defaults to lack legacy settings, this adds a new
>> CONFIG to select the type of vsyscall mapping to use, similar to the
>> existing "vsyscall" command line parameter.
>>
>> Signed-off-by: Kees Cook <[email protected]>
>
> Seems reasonable to me. One question, though: is there *any* reason to
> choose "native" over "emulate"? (Does "emulate" have a sufficient
> performance penalty to matter, and do people running old glibc really
> care about that performance while still not wanting to upgrade?)
> If there is a reason, could you please document it in the
> descriptions of the "native" and "emulate" options (as an upside and a
> downside, respectively)? If there isn't, you might consider a patch to
> remove "native".

I think "native" is available out of an abundance of caution. Andy
left it available, though I'm not sure if he had plans to remove
"native" entirely.

Can someone from the x86 tree take this patch, or are there other
things to improve?

Thanks!

-Kees

--
Kees Cook
Chrome OS Security

2015-08-31 21:24:15

by Andy Lutomirski

Subject: Re: [PATCH] x86, vsyscall: add CONFIG to control default

On Aug 31, 2015 1:13 PM, "Kees Cook" <[email protected]> wrote:
>
> On Wed, Aug 12, 2015 at 7:23 PM, Josh Triplett <[email protected]> wrote:
> > On Wed, Aug 12, 2015 at 05:55:19PM -0700, Kees Cook wrote:
> >> Most modern systems can run with vsyscall=none. In an effort to provide
> >> a way for build-time defaults to lack legacy settings, this adds a new
> >> CONFIG to select the type of vsyscall mapping to use, similar to the
> >> existing "vsyscall" command line parameter.
> >>
> >> Signed-off-by: Kees Cook <[email protected]>
> >
> > Seems reasonable to me. One question, though: is there *any* reason to
> > choose "native" over "emulate"? (Does "emulate" have a sufficient
> > performance penalty to matter, and do people running old glibc really
> > care about that performance while still not wanting to upgrade?)
> > If there is a reason, could you please document it in the
> > descriptions of the "native" and "emulate" options (as an upside and a
> > downside, respectively)? If there isn't, you might consider a patch to
> > remove "native".
>
> I think "native" is available out of an abundance of caution. Andy
> left it available, though I'm not sure if he had plans to remove
> "native" entirely.

Native adds almost no code and almost no maintenance burden -- it's
really just a PTE bit.

>
> Can someone from the x86 tree take this patch, or are there other
> things to improve?

It looks good to me.

I was thinking about how to control vsyscalls per process, and it's
not so easy. We can turn off emulation per process trivially (modulo
figuring out the ABI), but the Project Zero thing makes me think that
we want to be able to switch off *read* access.

For almost all purposes, we could just switch off read access globally
with no ill effects. The problem is that nasty little programs like
pin will start crashing when run on old binaries.

We could allocate two copies of the top pud, switch them out in the
pgd depending on whether vsyscalls are on for the mm, and clear the
G bit. It's a bit of a departure from how things work now, and it'll
interact really weirdly with the fixmap code and anything else that
pokes at that part of the kernel page tables (e.g. Xen?). Hmm.

--Andy
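
Whether the vsyscall page is mapped into a process, and with which permissions, can usually be observed from userspace by looking for the "[vsyscall]" entry in /proc/self/maps. The sketch below is a minimal illustration, not part of the patch: both function names are hypothetical, and it assumes the usual maps line layout (address range, then a four-character permission field). The permissions reported depend on the kernel version and the selected mode, and with vsyscall=none the entry is expected to be absent.

```c
#include <stdio.h>
#include <string.h>

/*
 * Inspect one line in /proc/<pid>/maps format. If it describes the
 * [vsyscall] mapping, copy its permission field (e.g. "r-xp") into
 * perms and return 1; otherwise return 0.
 */
static int vsyscall_perms_from_line(const char *line, char perms[5])
{
	unsigned long start, end;
	char field[5];

	if (!strstr(line, "[vsyscall]"))
		return 0;
	if (sscanf(line, "%lx-%lx %4s", &start, &end, field) != 3)
		return 0;
	strcpy(perms, field);
	return 1;
}

/*
 * Scan the current process's maps. Returns 1 and fills perms if a
 * [vsyscall] entry is found, 0 if there is none, -1 if the file
 * cannot be opened.
 */
static int vsyscall_mapping_perms(char perms[5])
{
	char line[512];
	int found = 0;
	FILE *f = fopen("/proc/self/maps", "r");

	if (!f)
		return -1;
	while (!found && fgets(line, sizeof(line), f))
		found = vsyscall_perms_from_line(line, perms);
	fclose(f);
	return found;
}
```

A caller would pass a five-byte buffer to vsyscall_mapping_perms() and print the result. This only reports what the kernel advertises for the mapping; it does not test whether reads from the page actually succeed.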