2008-02-12 17:00:29

by Arjan van de Ven

[permalink] [raw]
Subject: vmsplice exploits, stack protector and Makefiles


Hi,

I just read the excellent LWN writeup of the vmsplice security thing, and that got me
wondering why this attack wasn't stopped by the CONFIG_CC_STACKPROTECTOR option... because
it plain should have been...

some analysis later.. it turns out that the following line in the top level Makefile,
added by you in October 2007, entirely disables CONFIG_CC_STACKPROTECTOR ;(
With this line removed the exploit will be nicely stopped.

# Force gcc to behave correct even for buggy distributions
CFLAGS += $(call cc-option, -fno-stack-protector)


Now I realize that certain distros have patched gcc to compensate for their lack of distro
wide CFLAGS, and it's great to work around that... but would there be a way to NOT
disable this for CONFIG_CC_STACKPROTECTOR please? It would have made this
exploit not possible for those kernels that enable this feature (and that includes distros
like Fedora)

Greetings,
Arjan van de Ven

--
If you want to reach me at my work email, use [email protected]
For development, discussion and tips for power savings,
visit http://www.lesswatts.org


2008-02-12 18:50:24

by Sam Ravnborg

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles

On Tue, Feb 12, 2008 at 09:00:01AM -0800, Arjan van de Ven wrote:
>
> Hi,
>
> I just read the excellent LWN writeup of the vmsplice security thing, and that got me
> wondering why this attack wasn't stopped by the CONFIG_CC_STACKPROTECTOR option... because
> it plain should have been...
>
> some analysis later.. it turns out that the following line in the top level Makefile,
> added by you in October 2007, entirely disables CONFIG_CC_STACKPROTECTOR ;(
> With this line removed the exploit will be nicely stopped.
>
> # Force gcc to behave correct even for buggy distributions
> CFLAGS += $(call cc-option, -fno-stack-protector)
>
>
> Now I realize that certain distros have patched gcc to compensate for their lack of distro
> wide CFLAGS, and it's great to work around that... but would there be a way to NOT
> disable this for CONFIG_CC_STACKPROTECTOR please? It would have made this
> exploit not possible for those kernels that enable this feature (and that includes distros
> like Fedora)

I guess the problem is that we in arch/x86/Makefile enable the stackprotector
but then later in the main Makefile disables it.
So the right way to approach this should be to always disable it and the reenable it in
the arch Makefile.
So something like this?

Sam

diff --git a/Makefile b/Makefile
index c162370..7ccf6f5 100644
--- a/Makefile
+++ b/Makefile
@@ -507,6 +507,9 @@ else
KBUILD_CFLAGS += -O2
endif

+# Force gcc to behave correct even for buggy distributions
+KBUILD_CFLAGS += $(call cc-option, -fno-stack-protector)
+
include $(srctree)/arch/$(SRCARCH)/Makefile

ifdef CONFIG_FRAME_POINTER
@@ -525,9 +528,6 @@ ifdef CONFIG_DEBUG_SECTION_MISMATCH
KBUILD_CFLAGS += $(call cc-option, -fno-inline-functions-called-once)
endif

-# Force gcc to behave correct even for buggy distributions
-KBUILD_CFLAGS += $(call cc-option, -fno-stack-protector)
-
# arch Makefile may override CC so keep this after arch Makefile is included
NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include)
CHECKFLAGS += $(NOSTDINC_FLAGS)

2008-02-12 19:08:49

by Arjan van de Ven

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles

On Tue, 12 Feb 2008 19:50:12 +0100
Sam Ravnborg <[email protected]> wrote:
> >
> > Now I realize that certain distros have patched gcc to compensate
> > for their lack of distro wide CFLAGS, and it's great to work around
> > that... but would there be a way to NOT disable this for
> > CONFIG_CC_STACKPROTECTOR please? It would have made this exploit
> > not possible for those kernels that enable this feature (and that
> > includes distros like Fedora)
>
> I guess the problem is that we in arch/x86/Makefile enable the
> stackprotector but then later in the main Makefile disables it.
> So the right way to approach this should be to always disable it and
> the reenable it in the arch Makefile.
> So something like this?


the patch works fine for me.
Linus, can we please get this merged into .25? it will at least
again limit the damage exploits like the vmsplice one can do..

(I think it's also worth it for -stable)

2008-02-12 19:36:56

by Sam Ravnborg

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles

On Tue, Feb 12, 2008 at 11:08:18AM -0800, Arjan van de Ven wrote:
> On Tue, 12 Feb 2008 19:50:12 +0100
> Sam Ravnborg <[email protected]> wrote:
> > >
> > > Now I realize that certain distros have patched gcc to compensate
> > > for their lack of distro wide CFLAGS, and it's great to work around
> > > that... but would there be a way to NOT disable this for
> > > CONFIG_CC_STACKPROTECTOR please? It would have made this exploit
> > > not possible for those kernels that enable this feature (and that
> > > includes distros like Fedora)
> >
> > I guess the problem is that we in arch/x86/Makefile enable the
> > stackprotector but then later in the main Makefile disables it.
> > So the right way to approach this should be to always disable it and
> > the reenable it in the arch Makefile.
> > So something like this?
>
>
> the patch works fine for me.
> Linus, can we please get this merged into .25? it will at least
> again limit the damage exploits like the vmsplice one can do..
>
> (I think it's also worth it for -stable)

I will prepare it for my next -fix round. Due in a few days.

Sam

2008-02-13 15:17:54

by PaX Team

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles

On 12 Feb 2008 at 9:00, Arjan van de Ven wrote:
> I just read the excellent LWN writeup of the vmsplice security thing, and that got me
> wondering why this attack wasn't stopped by the CONFIG_CC_STACKPROTECTOR option... because
> it plain should have been...

what makes you think it should have been? you probably didn't even look,
there're so many problems with ssp here and in general.

1. despite the nice analysis in LWN, it unfortunately stopped half way where
Jon declared that:

And this turns the failure to read-verify the source array into a buffer
overflow vulnerability within the kernel. Once that is in place, it is a
relatively straightforward exercise for any suitably 31337 hacker to
cause the kernel to jump into the code of his or her choice. Game over.

fact of the matter is, it's far from game over at this point. and that's
because despite all appearances, this is far from your run-of-the-mill
stack buffer overflow. in particular, the overflow does *not* rely on
overwriting any saved return address on the stack.

the trickery with the fake compound page set up should have been a sign
that something else is going on here and the lesson ain't over until you
understand that part as well.

2. so why isn't this exploit about an overflowed return address? because
before any potentially overflowed return address would be dereferenced,
the kernel will attempt to release the page refcounts it acquired in
get_user_pages (check splice_to_pipe, called from vmsplice_to_pipe
where the overflowed buffer is). normally that wouldn't be a problem
since even though the pages[PIPE_BUFFERS] array was overflowed, it was
overflowed with valid struct page pointers, so releasing them should
be fine. except there's a trick in the exploit that will cause a userland
controlled struct page pointer to be released as well at which point
the exploit takes matters into its own hands:

previous to calling vmsplice, the exploit has prepared a specially
constructed struct page array to fake a compound page. the point in
using a compound page is that such a page has a destructor (read:
function pointer) which is now under the direct control of the exploit
and will result in ring-0 code execution of exploit code. *this* is
the point where it becomes a trivial exercise to escalate privileges.

3. what's ssp got to do with all this? Arjan would have you believe that
it would have caught this exploit in action, preventing the privilege
escalation. nothing could be further from the truth though.

for one if ssp were to kick into action, it could at most be at the
time vmsplice_to_pipe returns. except it doesn't as explained above.

but imagine for a second that vmsplice_to_pipe does return. would ssp
detect anything? at least gentoo's gcc 4.2.2 doesn't at all instrument
vmsplice_to_pipe with -fstack-protector, so no dice.

let's go further and imagine we enable CONFIG_CC_STACKPROTECTOR_ALL
(which in turn makes gcc use -fstack-protector-all). this will now
properly instrument vmsplice_to_pipe. problem is, such a kernel doesn't
boot. probably noone has ever tried it (why does it have a config option?).
the fix isn't trivial and possibly incomplete, see the patches at the end.

finally just imagine that ssp somehow caught this or another exploit in
action. what will we learn about it? nothing. that's right, due to bad
decisions made by certain developers (both gcc and kernel),__stack_chk_fail
doesn't get passed any extra info (unlike Etoh's original ssp), nor does
the kernel's version produce any actually useful output that, pray tell,
could help identify the attacked function. instead it just panic's. a truly
useful experience.

it also bears a note here that the way ssp is currently implemented in
the kernel is quite useless, a per-task static cookie is trivial to learn
in a kernel info-leak exploit that in turn can be built into the actual
attack payload to bypass any detection (a case that ssp was designed to
protect against, this particular bug/exploit doesn't give direct control
over the overflow content, hence even the current cookie method would have
been fine, were it not irrelevant for reasons explained above).

[...]
> It would have made this exploit not possible for those kernels that
> enable this feature (and that includes distros like Fedora)

sadly for all those users living in a false sense of security, this has
never been true. but long live cargo cult security.

---
patches to get CONFIG_CC_STACKPROTECTOR_ALL actually to work (it includes
the Makefile patch proposed in this thread already).

note that the fix to ACPI is an actual stack corruption bug (caught by ssp
thanks to a lucky stack layout), due to the misuse of the pci accessor
functions, probably a whole-tree audit is in order for similar bugs.

note also that the vsyscall functions (more precisely, all the code that
goes into .vsyscall* sections) had better be separated into their own .c
files so that they can be compiled without -mcmodel=kernel and use %fs for
getting the ssp cookie, if ssp is desired at all there).
---
diff -u linux-2.6.24.2-pax/Makefile linux-2.6.24.2-pax/Makefile
--- linux-2.6.24.2-pax/Makefile 2008-02-11 17:24:34.000000000 +0100
+++ linux-2.6.24.2-pax/Makefile 2008-02-12 20:31:56.000000000 +0100
@@ -507,6 +507,9 @@
KBUILD_CFLAGS += -O2
endif

+# Force gcc to behave correct even for buggy distributions
+KBUILD_CFLAGS += $(call cc-option, -fno-stack-protector)
+
include $(srctree)/arch/$(SRCARCH)/Makefile

ifdef CONFIG_FRAME_POINTER
@@ -520,9 +523,6 @@
KBUILD_AFLAGS += -gdwarf-2
endif

-# Force gcc to behave correct even for buggy distributions
-KBUILD_CFLAGS += $(call cc-option, -fno-stack-protector)
-
# arch Makefile may override CC so keep this after arch Makefile is included
NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include)
CHECKFLAGS += $(NOSTDINC_FLAGS)
diff -u linux-2.6.24.2-pax/arch/x86/kernel/entry_64.S linux-2.6.24.2-
pax/arch/x86/kernel/entry_64.S
--- linux-2.6.24.2-pax/arch/x86/kernel/entry_64.S 2008-01-25
15:34:25.000000000 +0100
+++ linux-2.6.24.2-pax/arch/x86/kernel/entry_64.S 2008-02-13
11:12:26.000000000 +0100
@@ -440,6 +440,7 @@
CFI_REGISTER rip, r11
SAVE_REST
FIXUP_TOP_OF_STACK %r11
+ movq %rsp, %rcx
call sys_execve
RESTORE_TOP_OF_STACK %r11
movq %rax,RAX(%rsp)
@@ -1004,15 +1005,16 @@
* rdi: name, rsi: argv, rdx: envp
*
* We want to fallback into:
- * extern long sys_execve(char *name, char **argv,char **envp, struct
pt_regs regs)
+ * extern long sys_execve(char *name, char **argv,char **envp, struct
pt_regs *regs)
*
* do_sys_execve asm fallback arguments:
- * rdi: name, rsi: argv, rdx: envp, fake frame on the stack
+ * rdi: name, rsi: argv, rdx: envp, rcx: fake frame on the stack
*/
ENTRY(kernel_execve)
CFI_STARTPROC
FAKE_STACK_FRAME $0
SAVE_ALL
+ movq %rsp,%rcx
call sys_execve
movq %rax, RAX(%rsp)
RESTORE_REST
diff -u linux-2.6.24.2-pax/arch/x86/kernel/process_64.c linux-2.6.24.2-
pax/arch/x86/kernel/process_64.c
--- linux-2.6.24.2-pax/arch/x86/kernel/process_64.c 2008-01-25
15:34:25.000000000 +0100
+++ linux-2.6.24.2-pax/arch/x86/kernel/process_64.c 2008-02-13
11:13:14.000000000 +0100
@@ -673,7 +673,6 @@
write_pda(kernelstack,
(unsigned long)task_stack_page(next_p) + THREAD_SIZE -
PDA_STACKOFFSET);
#ifdef CONFIG_CC_STACKPROTECTOR
- write_pda(stack_canary, next_p->stack_canary);
/*
* Build time only check to make sure the stack_canary is at
* offset 40 in the pda; this is a gcc ABI requirement
@@ -702,7 +701,7 @@
*/
asmlinkage
long sys_execve(char __user *name, char __user * __user *argv,
- char __user * __user *envp, struct pt_regs regs)
+ char __user * __user *envp, struct pt_regs *regs)
{
long error;
char * filename;
@@ -711,7 +710,7 @@
error = PTR_ERR(filename);
if (IS_ERR(filename))
return error;
- error = do_execve(filename, argv, envp, &regs);
+ error = do_execve(filename, argv, envp, regs);
if (error == 0) {
task_lock(current);
current->ptrace &= ~PT_DTRACE;
diff -u linux-2.6.24.2-pax/drivers/acpi/osl.c linux-2.6.24.2-
pax/drivers/acpi/osl.c
--- linux-2.6.24.2-pax/drivers/acpi/osl.c 2008-02-08 22:34:51.000000000
+0100
+++ linux-2.6.24.2-pax/drivers/acpi/osl.c 2008-02-13 03:06:49.000000000
+0100
@@ -524,7 +524,7 @@

acpi_status
acpi_os_read_pci_configuration(struct acpi_pci_id * pci_id, u32 reg,
- void *value, u32 width)
+ u32 *value, u32 width)
{
int result, size;

@@ -596,7 +596,7 @@
acpi_status status;
unsigned long temp;
acpi_object_type type;
- u8 tu8;
+ u32 tu8;

acpi_get_parent(chandle, &handle);
if (handle != rhandle) {
diff -u linux-2.6.24.2-pax/include/asm-x86/system_64.h linux-2.6.24.2-
pax/include/asm-x86/system_64.h
--- linux-2.6.24.2-pax/include/asm-x86/system_64.h 2008-01-25
15:34:27.000000000 +0100
+++ linux-2.6.24.2-pax/include/asm-x86/system_64.h 2008-02-12
22:03:42.000000000 +0100
@@ -33,6 +33,8 @@
".globl thread_return\n" \
"thread_return:\n\t" \
"movq %%gs:%P[pda_pcurrent],%%rsi\n\t"
\
+ "movq %P[task_canary](%%rsi),%%r8\n\t"
\
+ "movq %%r8,%%gs:%P[pda_canary]\n\t" \
"movq %P[thread_info](%%rsi),%%r8\n\t"
\
LOCK_PREFIX "btr %[tif_fork],%P[ti_flags](%%r8)\n\t"
\
"movq %%rax,%%rdi\n\t"
\
@@ -44,7 +46,9 @@
[ti_flags] "i" (offsetof(struct thread_info, flags)),\
[tif_fork] "i" (TIF_FORK), \
[thread_info] "i" (offsetof(struct task_struct, stack)),
\
- [pda_pcurrent] "i" (offsetof(struct x8664_pda, pcurrent))
\
+ [task_canary] "i" (offsetof(struct task_struct,
stack_canary)), \
+ [pda_pcurrent] "i" (offsetof(struct x8664_pda,
pcurrent)), \
+ [pda_canary] "i" (offsetof(struct x8664_pda,
stack_canary)) \
: "memory", "cc" __EXTRA_CLOBBER)

extern void load_gs_index(unsigned);
--- linux-2.6.24.2/arch/x86/kernel/Makefile_64 2008-01-24 23:58:37.000000000
+0100
+++ linux-2.6.24.2-pax/arch/x86/kernel/Makefile_64 2008-02-13
11:36:14.000000000 +0100
@@ -42,4 +42,6 @@ obj-$(CONFIG_PCI) += early-quirks.o
obj-y += topology.o
obj-y += pcspeaker.o

-CFLAGS_vsyscall_64.o := $(PROFILING) -g0
+CFLAGS_vsyscall_64.o := $(PROFILING) -g0 -fno-stack-protector
+CFLAGS_hpet.o := -fno-stack-protector
+CFLAGS_tsc_64.o := -fno-stack-protector
--- linux-2.6.24.2/include/acpi/acpiosxf.h 2008-01-24 23:58:37.000000000
+0100
+++ linux-2.6.24.2-pax/include/acpi/acpiosxf.h 2008-02-13 03:07:31.000000000
+0100
@@ -219,7 +219,7 @@ acpi_os_write_memory(acpi_physical_addre
*/
acpi_status
acpi_os_read_pci_configuration(struct acpi_pci_id *pci_id,
- u32 reg, void *value, u32 width);
+ u32 reg, u32 *value, u32 width);

acpi_status
acpi_os_write_pci_configuration(struct acpi_pci_id *pci_id,
only in patch2:
unchanged:
--- linux-2.6.24.2/kernel/panic.c 2008-01-24 23:58:37.000000000 +0100
+++ linux-2.6.24.2-pax/kernel/panic.c 2008-02-13 11:39:28.000000000 +0100
@@ -20,6 +20,7 @@
#include <linux/kexec.h>
#include <linux/debug_locks.h>
#include <linux/random.h>
+#include <linux/kallsyms.h>

int panic_on_oops;
int tainted;
@@ -299,6 +300,8 @@ void oops_exit(void)
*/
void __stack_chk_fail(void)
{
+ print_symbol("stack corrupted in: %s\n", (unsigned
long)__builtin_return_address(0));
+ dump_stack();
panic("stack-protector: Kernel stack is corrupted");
}
EXPORT_SYMBOL(__stack_chk_fail);

2008-02-13 15:29:33

by Ingo Molnar

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles


* [email protected] <[email protected]> wrote:

> patches to get CONFIG_CC_STACKPROTECTOR_ALL actually to work (it
> includes the Makefile patch proposed in this thread already).
>
> note that the fix to ACPI is an actual stack corruption bug (caught by
> ssp thanks to a lucky stack layout), due to the misuse of the pci
> accessor functions, probably a whole-tree audit is in order for
> similar bugs.
>
> note also that the vsyscall functions (more precisely, all the code
> that goes into .vsyscall* sections) had better be separated into their
> own .c files so that they can be compiled without -mcmodel=kernel and
> use %fs for getting the ssp cookie, if ssp is desired at all there).

thanks, i've picked up your patch into x86.git#mm and also made
stackprotector-all default-enabled so that we get more test coverage of
this critical security feature. x86.git#mm can be picked up via:

http://people.redhat.com/mingo/x86.git/README

head of the tree:

---------------->
commit e1d96d3e489d02b12984fb3c755b0f9a9ae0fe5f
Author: Ingo Molnar <[email protected]>
Date: Wed Feb 13 16:15:34 2008 +0100

x86: enable stack-protector by default

also enable the rodata and nx tests.
<----------------

your patch booted fine here with stackprotector-all enabled.

Ingo

2008-02-13 15:54:40

by Linus Torvalds

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles


Ingo,
if you're merging this, please do the independent parts really
independenrly. For example, the above is a patch in its own right, and
probably worth doing regardless of anything else.

(Same goes for the ACPI parts, I'll bounce that part to Len,

Linus

On Wed, 13 Feb 2008, [email protected] wrote:
>
> diff -u linux-2.6.24.2-pax/arch/x86/kernel/entry_64.S linux-2.6.24.2-pax/arch/x86/kernel/entry_64.S
> --- linux-2.6.24.2-pax/arch/x86/kernel/entry_64.S 2008-01-25 15:34:25.000000000 +0100
> +++ linux-2.6.24.2-pax/arch/x86/kernel/entry_64.S 2008-02-13 11:12:26.000000000 +0100
> @@ -440,6 +440,7 @@
> CFI_REGISTER rip, r11
> SAVE_REST
> FIXUP_TOP_OF_STACK %r11
> + movq %rsp, %rcx
> call sys_execve
> RESTORE_TOP_OF_STACK %r11
> movq %rax,RAX(%rsp)
> @@ -1004,15 +1005,16 @@
> * rdi: name, rsi: argv, rdx: envp
> *
> * We want to fallback into:
> - * extern long sys_execve(char *name, char **argv,char **envp, struct pt_regs regs)
> + * extern long sys_execve(char *name, char **argv,char **envp, struct pt_regs *regs)
> *
> * do_sys_execve asm fallback arguments:
> - * rdi: name, rsi: argv, rdx: envp, fake frame on the stack
> + * rdi: name, rsi: argv, rdx: envp, rcx: fake frame on the stack
> */
> ENTRY(kernel_execve)
> CFI_STARTPROC
> FAKE_STACK_FRAME $0
> SAVE_ALL
> + movq %rsp,%rcx
> call sys_execve
> movq %rax, RAX(%rsp)
> RESTORE_REST
> diff -u linux-2.6.24.2-pax/arch/x86/kernel/process_64.c linux-2.6.24.2-pax/arch/x86/kernel/process_64.c
> --- linux-2.6.24.2-pax/arch/x86/kernel/process_64.c 2008-01-25 15:34:25.000000000 +0100
> +++ linux-2.6.24.2-pax/arch/x86/kernel/process_64.c 2008-02-13 11:13:14.000000000 +0100
> @@ -702,7 +701,7 @@
> */
> asmlinkage
> long sys_execve(char __user *name, char __user * __user *argv,
> - char __user * __user *envp, struct pt_regs regs)
> + char __user * __user *envp, struct pt_regs *regs)
> {
> long error;
> char * filename;
> @@ -711,7 +710,7 @@
> error = PTR_ERR(filename);
> if (IS_ERR(filename))
> return error;
> - error = do_execve(filename, argv, envp, &regs);
> + error = do_execve(filename, argv, envp, regs);
> if (error == 0) {
> task_lock(current);
> current->ptrace &= ~PT_DTRACE;

2008-02-13 16:02:21

by Ingo Molnar

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles


* Linus Torvalds <[email protected]> wrote:

> if you're merging this, please do the independent parts really
> independenrly. For example, the above is a patch in its own right, and
> probably worth doing regardless of anything else.

yes. I wanted to have it tested for a bit, because the lack of use of
the gcc flag means complete lack of testing coverage. The obligatory
"please add a stackprotector self-test" debug feature request went to
Arjan two days ago already.

Ingo

2008-02-13 16:29:56

by Randy Dunlap

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles

On Wed, 13 Feb 2008 16:29:00 +0100 Ingo Molnar wrote:

>
> * [email protected] <[email protected]> wrote:
>
> > patches to get CONFIG_CC_STACKPROTECTOR_ALL actually to work (it
> > includes the Makefile patch proposed in this thread already).
> >
> > note that the fix to ACPI is an actual stack corruption bug (caught by
> > ssp thanks to a lucky stack layout), due to the misuse of the pci
> > accessor functions, probably a whole-tree audit is in order for
> > similar bugs.
> >
> > note also that the vsyscall functions (more precisely, all the code
> > that goes into .vsyscall* sections) had better be separated into their
> > own .c files so that they can be compiled without -mcmodel=kernel and
> > use %fs for getting the ssp cookie, if ssp is desired at all there).
>
> thanks, i've picked up your patch into x86.git#mm and also made
> stackprotector-all default-enabled so that we get more test coverage of
> this critical security feature. x86.git#mm can be picked up via:
>
> http://people.redhat.com/mingo/x86.git/README
>
> head of the tree:
>
> ---------------->
> commit e1d96d3e489d02b12984fb3c755b0f9a9ae0fe5f
> Author: Ingo Molnar <[email protected]>
> Date: Wed Feb 13 16:15:34 2008 +0100
>
> x86: enable stack-protector by default
>
> also enable the rodata and nx tests.
> <----------------
>
> your patch booted fine here with stackprotector-all enabled.

Is it signed-off-by: pageexec ?

Couldn't that be a problem?

---
~Randy

2008-02-13 16:49:41

by PaX Team

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles

On 13 Feb 2008 at 8:29, Randy Dunlap wrote:

> Is it signed-off-by: pageexec ?

no it isn't, on purpose as i won't give out my real name that the
DCO requires.

> Couldn't that be a problem?

no it couldn't. no employer -> no problem. the little pleasures of life.

2008-02-13 16:50:11

by Ingo Molnar

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles


* Ingo Molnar <[email protected]> wrote:

> thanks, i've picked up your patch into x86.git#mm and also made
> stackprotector-all default-enabled so that we get more test coverage
> of this critical security feature. x86.git#mm can be picked up via:

hm, had to pull it again because it crashed in testing:

CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ stepping 02
Brought up 2 CPUs
stack corrupted in: thread_return+0xd0/0xe2
Pid: 0, comm: swapper Not tainted 2.6.25-rc1 #3

Call Trace:
[<ffffffff80242018>] __stack_chk_fail+0x2a/0x68
[<ffffffff807a0478>] thread_return+0xd0/0xe2
[<ffffffff8025f5c3>] getnstimeofday+0x3f/0xab
[<ffffffff8025cfc7>] ktime_get_ts+0x27/0x6f
[<ffffffff8025d02b>] ktime_get+0x1c/0x66
[<ffffffff802639af>] tick_nohz_stop_idle+0x39/0x76
[<ffffffff8020b204>] cpu_idle+0xb6/0xca

Kernel panic - not syncing: stack-protector: Kernel stack is corrupted

config attached.

Ingo


Attachments:
(No filename) (963.00 B)
config (60.84 kB)
Download all attachments

2008-02-13 17:16:15

by PaX Team

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles

On 13 Feb 2008 at 17:48, Ingo Molnar wrote:

> hm, had to pull it again because it crashed in testing:

i've only tested .24, not .25 so maybe something changed. did you
make sure that

write_pda(stack_canary, next_p->stack_canary);

was removed from arch/x86/kernel/process_64.c:__switch_to? that's
the only reason i can think of that would trigger this trace.

> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 1
> AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ stepping 02
> Brought up 2 CPUs
> stack corrupted in: thread_return+0xd0/0xe2
> Pid: 0, comm: swapper Not tainted 2.6.25-rc1 #3
>
> Call Trace:
> [<ffffffff80242018>] __stack_chk_fail+0x2a/0x68
> [<ffffffff807a0478>] thread_return+0xd0/0xe2
> [<ffffffff8025f5c3>] getnstimeofday+0x3f/0xab
> [<ffffffff8025cfc7>] ktime_get_ts+0x27/0x6f
> [<ffffffff8025d02b>] ktime_get+0x1c/0x66
> [<ffffffff802639af>] tick_nohz_stop_idle+0x39/0x76
> [<ffffffff8020b204>] cpu_idle+0xb6/0xca
>
> Kernel panic - not syncing: stack-protector: Kernel stack is corrupted
>
> config attached.
>
> Ingo
>


2008-02-13 17:16:35

by Sam Ravnborg

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles

On Wed, Feb 13, 2008 at 05:01:48PM +0100, Ingo Molnar wrote:
>
> * Linus Torvalds <[email protected]> wrote:
>
> > if you're merging this, please do the independent parts really
> > independenrly. For example, the above is a patch in its own right, and
> > probably worth doing regardless of anything else.
>
> yes. I wanted to have it tested for a bit, because the lack of use of
> the gcc flag means complete lack of testing coverage. The obligatory
> "please add a stackprotector self-test" debug feature request went to
> Arjan two days ago already.

Do you think that the top-lvel Makefile change is OK to merge?
I would like to merge it separatly to have maximum bisectability.

Sam

2008-02-14 06:12:44

by Ingo Molnar

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles


* Sam Ravnborg <[email protected]> wrote:

> > > if you're merging this, please do the independent parts really
> > > independenrly. For example, the above is a patch in its own right,
> > > and probably worth doing regardless of anything else.
> >
> > yes. I wanted to have it tested for a bit, because the lack of use
> > of the gcc flag means complete lack of testing coverage. The
> > obligatory "please add a stackprotector self-test" debug feature
> > request went to Arjan two days ago already.
>
> Do you think that the top-lvel Makefile change is OK to merge? I would
> like to merge it separatly to have maximum bisectability.

sure.

Acked-by: Ingo Molnar <[email protected]>

Ingo

2008-02-14 06:17:41

by Ingo Molnar

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles


* [email protected] <[email protected]> wrote:

> > hm, had to pull it again because it crashed in testing:
>
> i've only tested .24, not .25 so maybe something changed. did you make
> sure that
>
> write_pda(stack_canary, next_p->stack_canary);
>
> was removed from arch/x86/kernel/process_64.c:__switch_to? that's the
> only reason i can think of that would trigger this trace.

I hand-ported your fixes [the patch was whitespace damaged] so i'm quite
sure i got every bit of it - but find it below for reference. I think
the percpu changes in .25 might have interfered somewhere. Will
investigate.

Ingo

--------------->
Subject: x86: fix stack protector and Makefiles
From: [email protected]
Date: Wed, 13 Feb 2008 15:38:45 +0200

there're so many problems with ssp here and in general.

1. despite the nice analysis in LWN, it unfortunately stopped half way where
Jon declared that:

And this turns the failure to read-verify the source array into a buffer
overflow vulnerability within the kernel. Once that is in place, it is a
relatively straightforward exercise for any suitably 31337 hacker to
cause the kernel to jump into the code of his or her choice. Game over.

fact of the matter is, it's far from game over at this point. and that's
because despite all appearances, this is far from your run-of-the-mill
stack buffer overflow. in particular, the overflow does *not* rely on
overwriting any saved return address on the stack.

the trickery with the fake compound page set up should have been a sign
that something else is going on here and the lesson ain't over until you
understand that part as well.

2. so why isn't this exploit about an overflowed return address? because
before any potentially overflowed return address would be dereferenced,
the kernel will attempt to release the page refcounts it acquired in
get_user_pages (check splice_to_pipe, called from vmsplice_to_pipe
where the overflowed buffer is). normally that wouldn't be a problem
since even though the pages[PIPE_BUFFERS] array was overflowed, it was
overflowed with valid struct page pointers, so releasing them should
be fine. except there's a trick in the exploit that will cause a userland
controlled struct page pointer to be released as well at which point
the exploit takes matters into its own hands:

previous to calling vmsplice, the exploit has prepared a specially
constructed struct page array to fake a compound page. the point in
using a compound page is that such a page has a destructor (read:
function pointer) which is now under the direct control of the exploit
and will result in ring-0 code execution of exploit code. *this* is
the point where it becomes a trivial exercise to escalate privileges.

3. what's ssp got to do with all this? Arjan would have you believe that
it would have caught this exploit in action, preventing the privilege
escalation. nothing could be further from the truth though.

for one if ssp were to kick into action, it could at most be at the
time vmsplice_to_pipe returns. except it doesn't as explained above.

but imagine for a second that vmsplice_to_pipe does return. would ssp
detect anything? at least gentoo's gcc 4.2.2 doesn't at all instrument
vmsplice_to_pipe with -fstack-protector, so no dice.

let's go further and imagine we enable CONFIG_CC_STACKPROTECTOR_ALL
(which in turn makes gcc use -fstack-protector-all). this will now
properly instrument vmsplice_to_pipe. problem is, such a kernel doesn't
boot. probably noone has ever tried it (why does it have a config option?).
the fix isn't trivial and possibly incomplete, see the patches at the end.

finally just imagine that ssp somehow caught this or another exploit in
action. what will we learn about it? nothing. that's right, due to bad
decisions made by certain developers (both gcc and kernel),__stack_chk_fail
doesn't get passed any extra info (unlike Etoh's original ssp), nor does
the kernel's version produce any actually useful output that, pray tell,
could help identify the attacked function. instead it just panic's. a truly
useful experience.

it also bears a note here that the way ssp is currently implemented in
the kernel is quite useless, a per-task static cookie is trivial to learn
in a kernel info-leak exploit that in turn can be built into the actual
attack payload to bypass any detection (a case that ssp was designed to
protect against, this particular bug/exploit doesn't give direct control
over the overflow content, hence even the current cookie method would have
been fine, were it not irrelevant for reasons explained above).

[...]
> It would have made this exploit not possible for those kernels that
> enable this feature (and that includes distros like Fedora)

sadly for all those users living in a false sense of security, this has
never been true. but long live cargo cult security.

patches to get CONFIG_CC_STACKPROTECTOR_ALL actually to work (it includes
the Makefile patch proposed in this thread already).

note also that the vsyscall functions (more precisely, all the code that
goes into .vsyscall* sections) had better be separated into their own .c
files so that they can be compiled without -mcmodel=kernel and use %fs for
getting the ssp cookie, if ssp is desired at all there).

Signed-off-by: Ingo Molnar <[email protected]>
---
Makefile | 6 +++---
arch/x86/kernel/Makefile | 8 ++++++++
arch/x86/kernel/entry_64.S | 6 ++++--
arch/x86/kernel/process_64.c | 5 ++---
include/asm-x86/system.h | 6 +++++-
kernel/panic.c | 2 ++
6 files changed, 24 insertions(+), 9 deletions(-)

Index: linux-x86.q/Makefile
===================================================================
--- linux-x86.q.orig/Makefile
+++ linux-x86.q/Makefile
@@ -507,6 +507,9 @@ else
KBUILD_CFLAGS += -O2
endif

+# Force gcc to behave correct even for buggy distributions
+KBUILD_CFLAGS += $(call cc-option, -fno-stack-protector)
+
include $(srctree)/arch/$(SRCARCH)/Makefile

ifdef CONFIG_FRAME_POINTER
@@ -525,9 +528,6 @@ ifdef CONFIG_DEBUG_SECTION_MISMATCH
KBUILD_CFLAGS += $(call cc-option, -fno-inline-functions-called-once)
endif

-# Force gcc to behave correct even for buggy distributions
-KBUILD_CFLAGS += $(call cc-option, -fno-stack-protector)
-
# arch Makefile may override CC so keep this after arch Makefile is included
NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include)
CHECKFLAGS += $(NOSTDINC_FLAGS)
Index: linux-x86.q/arch/x86/kernel/Makefile
===================================================================
--- linux-x86.q.orig/arch/x86/kernel/Makefile
+++ linux-x86.q/arch/x86/kernel/Makefile
@@ -8,6 +8,14 @@ extra-$(CONFIG_X86_64) += head64.o
CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
CFLAGS_vsyscall_64.o := $(PROFILING) -g0

+#
+# Vsyscalls (which work on the user stack) should have
+# no stack-protector checks:
+#
+CFLAGS_vsyscall_64.o := $(PROFILING) -g0 -fno-stack-protector
+CFLAGS_hpet.o := -fno-stack-protector
+CFLAGS_tsc_64.o := -fno-stack-protector
+
obj-y := process_$(BITS).o signal_$(BITS).o entry_$(BITS).o
obj-y += traps_$(BITS).o irq_$(BITS).o
obj-y += time_$(BITS).o ioport.o ldt.o
Index: linux-x86.q/arch/x86/kernel/entry_64.S
===================================================================
--- linux-x86.q.orig/arch/x86/kernel/entry_64.S
+++ linux-x86.q/arch/x86/kernel/entry_64.S
@@ -453,6 +453,7 @@ ENTRY(stub_execve)
CFI_REGISTER rip, r11
SAVE_REST
FIXUP_TOP_OF_STACK %r11
+ movq %rsp, %rcx
call sys_execve
RESTORE_TOP_OF_STACK %r11
movq %rax,RAX(%rsp)
@@ -1036,15 +1037,16 @@ ENDPROC(child_rip)
* rdi: name, rsi: argv, rdx: envp
*
* We want to fallback into:
- * extern long sys_execve(char *name, char **argv,char **envp, struct pt_regs regs)
+ * extern long sys_execve(char *name, char **argv,char **envp, struct pt_regs *regs)
*
* do_sys_execve asm fallback arguments:
- * rdi: name, rsi: argv, rdx: envp, fake frame on the stack
+ * rdi: name, rsi: argv, rdx: envp, rcx: fake frame on the stack
*/
ENTRY(kernel_execve)
CFI_STARTPROC
FAKE_STACK_FRAME $0
SAVE_ALL
+ movq %rsp,%rcx
call sys_execve
movq %rax, RAX(%rsp)
RESTORE_REST
Index: linux-x86.q/arch/x86/kernel/process_64.c
===================================================================
--- linux-x86.q.orig/arch/x86/kernel/process_64.c
+++ linux-x86.q/arch/x86/kernel/process_64.c
@@ -676,7 +676,6 @@ __switch_to(struct task_struct *prev_p,
write_pda(kernelstack,
(unsigned long)task_stack_page(next_p) + THREAD_SIZE - PDA_STACKOFFSET);
#ifdef CONFIG_CC_STACKPROTECTOR
- write_pda(stack_canary, next_p->stack_canary);
/*
* Build time only check to make sure the stack_canary is at
* offset 40 in the pda; this is a gcc ABI requirement
@@ -705,7 +704,7 @@ __switch_to(struct task_struct *prev_p,
*/
asmlinkage
long sys_execve(char __user *name, char __user * __user *argv,
- char __user * __user *envp, struct pt_regs regs)
+ char __user * __user *envp, struct pt_regs *regs)
{
long error;
char * filename;
@@ -714,7 +713,7 @@ long sys_execve(char __user *name, char
error = PTR_ERR(filename);
if (IS_ERR(filename))
return error;
- error = do_execve(filename, argv, envp, &regs);
+ error = do_execve(filename, argv, envp, regs);
putname(filename);
return error;
}
Index: linux-x86.q/include/asm-x86/system.h
===================================================================
--- linux-x86.q.orig/include/asm-x86/system.h
+++ linux-x86.q/include/asm-x86/system.h
@@ -70,6 +70,8 @@ struct task_struct *__switch_to(struct t
".globl thread_return\n" \
"thread_return:\n\t" \
"movq %%gs:%P[pda_pcurrent],%%rsi\n\t" \
+ "movq %P[task_canary](%%rsi),%%r8\n\t" \
+ "movq %%r8,%%gs:%P[pda_canary]\n\t" \
"movq %P[thread_info](%%rsi),%%r8\n\t" \
LOCK_PREFIX "btr %[tif_fork],%P[ti_flags](%%r8)\n\t" \
"movq %%rax,%%rdi\n\t" \
@@ -81,7 +83,9 @@ struct task_struct *__switch_to(struct t
[ti_flags] "i" (offsetof(struct thread_info, flags)), \
[tif_fork] "i" (TIF_FORK), \
[thread_info] "i" (offsetof(struct task_struct, stack)), \
- [pda_pcurrent] "i" (offsetof(struct x8664_pda, pcurrent)) \
+ [task_canary] "i" (offsetof(struct task_struct, stack_canary)),\
+ [pda_pcurrent] "i" (offsetof(struct x8664_pda, pcurrent)), \
+ [pda_canary] "i" (offsetof(struct x8664_pda, stack_canary))\
: "memory", "cc" __EXTRA_CLOBBER)
#endif

Index: linux-x86.q/kernel/panic.c
===================================================================
--- linux-x86.q.orig/kernel/panic.c
+++ linux-x86.q/kernel/panic.c
@@ -323,6 +323,8 @@ EXPORT_SYMBOL(warn_on_slowpath);
*/
void __stack_chk_fail(void)
{
+ print_symbol("stack corrupted in: %s\n", (unsigned long)__builtin_return_address(0));
+ dump_stack();
panic("stack-protector: Kernel stack is corrupted");
}
EXPORT_SYMBOL(__stack_chk_fail);

2008-02-14 07:30:31

by Ingo Molnar

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles


* Ingo Molnar <[email protected]> wrote:

> > was removed from arch/x86/kernel/process_64.c:__switch_to? that's
> > the only reason i can think of that would trigger this trace.
>
> I hand-ported your fixes [the patch was whitespace damaged] so i'm
> quite sure i got every bit of it - but find it below for reference. I
> think the percpu changes in .25 might have interfered somewhere. Will
> investigate.

ok, Arjan found the bug: it was that idle threads didnt have their
canary set up right.

[ note that this is still not complete because the initial idle thread
still has a zero canary. But it at least boots now. ]

Ingo

------------------------->
Subject: x86: setup stack canary for the idle threads
From: Arjan van de Ven <[email protected]>

The idle threads for non-boot CPUs are a bit special in how they
are created; the result is that these don't have the stack canary
set up properly in their PDA. Easiest fix is to just always set
the PDA up correctly when entering the idle thread; this is a NOP
for the boot cpu.

Signed-off-by: Arjan van de Ven <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/process_64.c | 9 +++++++++
1 file changed, 9 insertions(+)

Index: linux-x86.q/arch/x86/kernel/process_64.c
===================================================================
--- linux-x86.q.orig/arch/x86/kernel/process_64.c
+++ linux-x86.q/arch/x86/kernel/process_64.c
@@ -166,6 +166,15 @@ static inline void play_dead(void)
void cpu_idle(void)
{
current_thread_info()->status |= TS_POLLING;
+
+#ifdef CONFIG_CC_STACKPROTECTOR
+ /*
+ * If we're the non-boot CPU, nothing set the PDA stack
+ * canary up for us. This is as good a place as any for
+ * doing that.
+ */
+ write_pda(stack_canary, current->stack_canary);
+#endif
/* endless idle loop with no priority at all */
while (1) {
tick_nohz_stop_sched_tick();

2008-02-14 07:43:52

by Sam Ravnborg

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles

> --- linux-2.6.24.2/arch/x86/kernel/Makefile_64 2008-01-24 23:58:37.000000000
> +0100
> +++ linux-2.6.24.2-pax/arch/x86/kernel/Makefile_64 2008-02-13
> 11:36:14.000000000 +0100
> @@ -42,4 +42,6 @@ obj-$(CONFIG_PCI) += early-quirks.o
> obj-y += topology.o
> obj-y += pcspeaker.o
>
> -CFLAGS_vsyscall_64.o := $(PROFILING) -g0
> +CFLAGS_vsyscall_64.o := $(PROFILING) -g0 -fno-stack-protector
> +CFLAGS_hpet.o := -fno-stack-protector
> +CFLAGS_tsc_64.o := -fno-stack-protector

We should use:
nostackp := $(call cc-option, -fno-stack-protector)
+CFLAGS_vsyscall_64.o := $(PROFILING) -g0 $(nostackp)
+CFLAGS_hpet.o := $(nostackp)
+CFLAGS_tsc_64.o := $(nostackp)

(or somthing similar)

because we cannot rely on that all gcc versions has support
for -fno-stack-protector

Sam

2008-02-14 11:24:40

by PaX Team

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles

On 14 Feb 2008 at 8:30, Ingo Molnar wrote:

> --- linux-x86.q.orig/arch/x86/kernel/process_64.c
> +++ linux-x86.q/arch/x86/kernel/process_64.c
> @@ -166,6 +166,15 @@ static inline void play_dead(void)
> void cpu_idle(void)
> {
> current_thread_info()->status |= TS_POLLING;
> +
> +#ifdef CONFIG_CC_STACKPROTECTOR
> + /*
> + * If we're the non-boot CPU, nothing set the PDA stack
> + * canary up for us. This is as good a place as any for
> + * doing that.
> + */
> + write_pda(stack_canary, current->stack_canary);
> +#endif

i wonder if these #ifdef's are really needed at all, even if one doesn't
use -fstack-protector, having the code set up the canary has like 0
performance impact. not to mention that i think the change in switch_to
means that it won't compile without CONFIG_CC_STACKPROTECTOR enabled and
instead of making that macro conditional it's just a lot easier to enable
the canary all the time.

2008-02-14 12:21:16

by Jan Engelhardt

[permalink] [raw]
Subject: Re: vmsplice exploits, stack protector and Makefiles


On Feb 13 2008 17:48, [email protected] wrote:
>On 13 Feb 2008 at 8:29, Randy Dunlap wrote:
>
>> Is it signed-off-by: pageexec ?
>
>no it isn't, on purpose as i won't give out my real name that the
>DCO requires.

But could still add "Brought-to-attention-by: pageexec@.." or
something like that.