2018-08-30 19:50:40

by Jann Horn

Subject: [PATCH] x86/dumpstack: fix address space casting in show_opcodes()

I sloppily passed a kernel-typed pointer to __range_not_ok(), and sparse
doesn't like that.
Make `prologue` a __user pointer (to protect against accidental
dereferences) and force-cast it to a kernel pointer when calling
probe_kernel_read(), which will then immediately force-cast it back to a
user pointer.

Fixes: a644cf538b11 ("x86/dumpstack: Don't dump kernel memory based on usermode RIP")
Signed-off-by: Jann Horn <[email protected]>
---
arch/x86/kernel/dumpstack.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 605c60b1624f..651aed36291a 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -96,7 +96,7 @@ void show_opcodes(struct pt_regs *regs, const char *loglvl)
 #define EPILOGUE_SIZE 21
 #define OPCODE_BUFSIZE (PROLOGUE_SIZE + 1 + EPILOGUE_SIZE)
 	u8 opcodes[OPCODE_BUFSIZE];
-	u8 *prologue = (u8 *)(regs->ip - PROLOGUE_SIZE);
+	u8 __user *prologue = (u8 __user *)(regs->ip - PROLOGUE_SIZE);
 	bool bad_ip;

 	/*
@@ -106,7 +106,8 @@ void show_opcodes(struct pt_regs *regs, const char *loglvl)
 	bad_ip = user_mode(regs) &&
 		__range_not_ok(prologue, OPCODE_BUFSIZE, TASK_SIZE_MAX);

-	if (bad_ip || probe_kernel_read(opcodes, prologue, OPCODE_BUFSIZE)) {
+	if (bad_ip || probe_kernel_read(opcodes, (__force u8 *)prologue,
+					OPCODE_BUFSIZE)) {
 		printk("%sCode: Bad RIP value.\n", loglvl);
 	} else {
 		printk("%sCode: %" __stringify(PROLOGUE_SIZE) "ph <%02x> %"
--
2.19.0.rc0.228.g281dcd1b4d0-goog
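
For readers unfamiliar with the annotations in the patch above, here is a minimal userspace sketch of the cast pattern it describes. It is an illustration under stated assumptions, not kernel code: the `__user` and `__force` definitions follow the form the kernel uses when sparse runs (`__CHECKER__` defined), while `read_stub()` and `copy_opcodes_sketch()` are hypothetical names standing in for probe_kernel_read() and the relevant part of show_opcodes().

```c
#include <stddef.h>
#include <string.h>

/*
 * When sparse runs it defines __CHECKER__ and the annotations become
 * attributes; an ordinary compiler sees them as empty macros.
 */
#ifdef __CHECKER__
# define __user  __attribute__((noderef, address_space(1)))
# define __force __attribute__((force))
#else
# define __user
# define __force
#endif

typedef unsigned char u8;

/* Hypothetical stand-in for probe_kernel_read(): takes a plain pointer. */
static long read_stub(void *dst, const void *src, size_t size)
{
	memcpy(dst, src, size);
	return 0;
}

/* Hypothetical caller that mirrors the casts in the patch. */
static long copy_opcodes_sketch(u8 *dst, unsigned long ip, size_t len)
{
	/* Marked __user so that sparse flags any direct dereference. */
	u8 __user *p = (u8 __user *)ip;

	/* __force tells sparse the address-space change is intentional. */
	return read_stub(dst, (__force u8 *)p, len);
}
```

Built with a normal compiler both macros expand to nothing, so the casts have no runtime effect; only sparse attaches meaning to them.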



2018-08-31 08:05:40

by Borislav Petkov

Subject: Re: [PATCH] x86/dumpstack: fix address space casting in show_opcodes()

On Thu, Aug 30, 2018 at 09:47:36PM +0200, Jann Horn wrote:
> I sloppily passed a kernel-typed pointer to __range_not_ok(), and sparse
> doesn't like that.
> Make `prologue` a __user pointer (to protect against accidental
> dereferences) and force-cast it to a kernel pointer when calling
> probe_kernel_read(), which will then immediately force-cast it back to a
> user pointer.

Yeah, that's some crazy casting.

Can we define a local __user pointer only for the check instead? It is
less casting and looks simpler and actually even easier to understand
what we're doing...

---
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 605c60b1624f..9c5a15491108 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -97,14 +97,17 @@ void show_opcodes(struct pt_regs *regs, const char *loglvl)
 #define OPCODE_BUFSIZE (PROLOGUE_SIZE + 1 + EPILOGUE_SIZE)
 	u8 opcodes[OPCODE_BUFSIZE];
 	u8 *prologue = (u8 *)(regs->ip - PROLOGUE_SIZE);
-	bool bad_ip;
+	bool bad_ip = false;

 	/*
 	 * Make sure userspace isn't trying to trick us into dumping kernel
 	 * memory by pointing the userspace instruction pointer at it.
 	 */
-	bad_ip = user_mode(regs) &&
-		__range_not_ok(prologue, OPCODE_BUFSIZE, TASK_SIZE_MAX);
+	if (user_mode(regs)) {
+		u8 __user *up = (u8 __user *)prologue;
+
+		bad_ip = __range_not_ok(up, OPCODE_BUFSIZE, TASK_SIZE_MAX);
+	}

 	if (bad_ip || probe_kernel_read(opcodes, prologue, OPCODE_BUFSIZE)) {
 		printk("%sCode: Bad RIP value.\n", loglvl);

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--

2018-08-31 09:27:01

by Luc Van Oostenryck

Subject: Re: [PATCH] x86/dumpstack: fix address space casting in show_opcodes()

On Thu, Aug 30, 2018 at 09:47:36PM +0200, Jann Horn wrote:
> I sloppily passed a kernel-typed pointer to __range_not_ok(), and sparse
> doesn't like that.
> Make `prologue` a __user pointer (to protect against accidental
> dereferences) and force-cast it to a kernel pointer when calling
> probe_kernel_read(), which will then immediately force-cast it back to a
> user pointer.

It's a bit sad to have to do this.
__range_not_ok() explicitly requires a __user pointer (I don't know
if there is a good reason for it) but the real job is done by
__chk_range_not_ok(). Can't you use the latter instead?


-- Luc Van Oostenryck
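
The suggestion above relies on the underlying check working on plain integers rather than annotated pointers. A minimal sketch of an overflow-safe range check in that spirit follows; it is not a copy of the kernel's __chk_range_not_ok(), and the name `range_not_ok_sketch()` and the limit values used with it are assumptions for illustration only.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when [addr, addr + size) is NOT
 * entirely at or below limit. Operates on integers, so no address-space
 * annotation is involved.
 */
static bool range_not_ok_sketch(unsigned long addr, unsigned long size,
				unsigned long limit)
{
	unsigned long end = addr + size;	/* unsigned, so it may wrap */

	/* If the sum wrapped around, the range cannot be valid. */
	if (end < size)
		return true;

	return end > limit;
}
```

With an illustrative limit of 0x1000, a call such as range_not_ok_sketch(0x10, 0x20, 0x1000) reports the range as fine, while a range that crosses the limit or wraps around the top of the address space is rejected.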

2018-08-31 13:32:36

by Jann Horn

Subject: Re: [PATCH] x86/dumpstack: fix address space casting in show_opcodes()

On Fri, Aug 31, 2018 at 10:27 AM Luc Van Oostenryck
<[email protected]> wrote:
>
> On Thu, Aug 30, 2018 at 09:47:36PM +0200, Jann Horn wrote:
> > I sloppily passed a kernel-typed pointer to __range_not_ok(), and sparse
> > doesn't like that.
> > Make `prologue` a __user pointer (to protect against accidental
> > dereferences) and force-cast it to a kernel pointer when calling
> > probe_kernel_read(), which will then immediately force-cast it back to a
> > user pointer.
>
> It's a bit sad to have to do this.
> __range_not_ok() explicitly requires a __user pointer (I don't know
> if there is a good reason for it) but the real job is done by
> __chk_range_not_ok(). Can't you use the latter instead?

Yeah, I guess I can do that. Will send a v2 in a bit...


By the way, here are all 60 probe_kernel_read() callers:

arch/arm/kernel/ftrace.c: if (probe_kernel_read(&replaced, (void *)pc, MCOUNT_INSN_SIZE))
arch/arm/kernel/kgdb.c: err = probe_kernel_read(bpt->saved_instr, (char *)bpt->bpt_addr,
arch/arm64/kernel/insn.c: ret = probe_kernel_read(&val, addr, AARCH64_INSN_SIZE);
arch/ia64/kernel/ftrace.c: if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
arch/ia64/kernel/ftrace.c: if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
arch/mips/kernel/kprobes.c: if ((probe_kernel_read(&prev_insn, p->addr - 1,
arch/powerpc/kernel/module_64.c: if (probe_kernel_read(&magic, &stub->magic, sizeof(magic))) {
arch/powerpc/kernel/module_64.c: if (probe_kernel_read(&funcdata, &stub->funcdata, sizeof(funcdata))) {
arch/powerpc/kernel/trace/ftrace.c: if (probe_kernel_read(&replaced, (void *)ip, MCOUNT_INSN_SIZE))
arch/powerpc/kernel/trace/ftrace.c: if (probe_kernel_read(&op, (void *)ip, sizeof(int))) {
arch/powerpc/kernel/trace/ftrace.c: if (probe_kernel_read(&op, (void *)(ip - 4), 4)) {
arch/powerpc/kernel/trace/ftrace.c: if (probe_kernel_read(&op, (void *)(ip+4), MCOUNT_INSN_SIZE)) {
arch/powerpc/kernel/trace/ftrace.c: if (probe_kernel_read(&op, (void *)ip, MCOUNT_INSN_SIZE))
arch/powerpc/kernel/trace/ftrace.c: if (probe_kernel_read(jmp, (void *)tramp, sizeof(jmp))) {
arch/powerpc/kernel/trace/ftrace.c: if (probe_kernel_read(op, ip, sizeof(op)))
arch/powerpc/kernel/trace/ftrace.c: if (probe_kernel_read(&op, (void *)ip, MCOUNT_INSN_SIZE))
arch/powerpc/kernel/trace/ftrace.c: if (probe_kernel_read(&op, (void *)ip, sizeof(int))) {
arch/powerpc/perf/core-book3s.c: if (probe_kernel_read(&instr, (void *)addr, sizeof(instr)))
arch/riscv/kernel/ftrace.c: if (probe_kernel_read(replaced, (void *)hook_pos, MCOUNT_INSN_SIZE))
arch/s390/kernel/ftrace.c: if (probe_kernel_read(&old, (void *) rec->ip, sizeof(old)))
arch/s390/kernel/ftrace.c: if (probe_kernel_read(&old, (void *) rec->ip, sizeof(old)))
arch/sh/kernel/ftrace.c: if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
arch/sh/kernel/ftrace.c: if (probe_kernel_read(code, (void *)ip, MCOUNT_INSN_SIZE))
arch/x86/kernel/dumpstack.c: if (probe_kernel_read(opcodes, rip - PROLOGUE_SIZE, OPCODE_BUFSIZE)) {
arch/x86/kernel/ftrace.c: if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
arch/x86/kernel/ftrace.c: if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
arch/x86/kernel/ftrace.c: if (probe_kernel_read(ins, (void *)ip, MCOUNT_INSN_SIZE))
arch/x86/kernel/ftrace.c: ret = probe_kernel_read(trampoline, (void *)start_offset, size);
arch/x86/kernel/ftrace.c: ret = probe_kernel_read(&calc, ptr, MCOUNT_INSN_SIZE);
arch/x86/kernel/kgdb.c: err = probe_kernel_read(bpt->saved_instr, (char *)bpt->bpt_addr,
arch/x86/kernel/kgdb.c: err = probe_kernel_read(opc, (char *)bpt->bpt_addr, BREAK_INSTR_SIZE);
arch/x86/kernel/kgdb.c: err = probe_kernel_read(opc, (char *)bpt->bpt_addr, BREAK_INSTR_SIZE);
arch/x86/kernel/kprobes/core.c: if (probe_kernel_read(buf, (void *)addr,
arch/x86/kernel/kprobes/core.c: if (probe_kernel_read(dest, (void *)recovered_insn, MAX_INSN_SIZE))
arch/x86/kernel/kprobes/opt.c: if (probe_kernel_read(buf, (void *)addr,
arch/x86/xen/enlighten_pv.c: probe_kernel_read(&dummy, v, 1);
drivers/char/mem.c: probe = probe_kernel_read(bounce, ptr, sz);
drivers/dio/dio.c: if (probe_kernel_read(&i, (unsigned char *)va + DIO_IDOFF, 1)) {
drivers/dio/dio.c: if (probe_kernel_read(&i, (unsigned char *)va + DIO_IDOFF, 1)) {
drivers/input/serio/hp_sdc.c: if (!probe_kernel_read(&i, (unsigned char *)hp_sdc.data_io, 1))
drivers/misc/kgdbts.c: probe_kernel_read(before, (char *)kgdbts_break_test,
drivers/misc/kgdbts.c: probe_kernel_read(after, (char *)kgdbts_break_test,
drivers/video/fbdev/hpfb.c: err = probe_kernel_read(&i, (unsigned char *)INTFBVADDR + DIO_IDOFF, 1);
fs/proc/kcore.c: if (probe_kernel_read(buf, (void *) start, tsz)) {
include/linux/uaccess.h: probe_kernel_read(&retval, addr, sizeof(retval))
kernel/debug/debug_core.c: err = probe_kernel_read(bpt->saved_instr, (char *)bpt->bpt_addr,
kernel/debug/gdbstub.c: err = probe_kernel_read(tmp, mem, count);
kernel/debug/kdb/kdb_main.c: if (!p || probe_kernel_read(&tmp, (char *)p, sizeof(unsigned long)))
kernel/debug/kdb/kdb_support.c: int ret = probe_kernel_read((char *)res, (char *)addr, size);
kernel/debug/kdb/kdb_support.c: int ret = probe_kernel_read((char *)addr, (char *)res, size);
kernel/debug/kdb/kdb_support.c: if (!p || probe_kernel_read(&tmp, (char *)p, sizeof(unsigned long)))
kernel/kthread.c: probe_kernel_read(&data, &kthread->data, sizeof(data));
kernel/trace/bpf_trace.c: ret = probe_kernel_read(dst, unsafe_ptr, size);
kernel/workqueue.c: probe_kernel_read(&fn, &worker->current_func, sizeof(fn));
kernel/workqueue.c: probe_kernel_read(&pwq, &worker->current_pwq, sizeof(pwq));
kernel/workqueue.c: probe_kernel_read(&wq, &pwq->wq, sizeof(wq));
kernel/workqueue.c: probe_kernel_read(name, wq->name, sizeof(name) - 1);
kernel/workqueue.c: probe_kernel_read(desc, worker->desc, sizeof(desc) - 1);
mm/slab.c: if (probe_kernel_read(&v, dbg_userword(c, p), sizeof(v)))
mm/slub.c: probe_kernel_read(&p, (void **)freepointer_addr, sizeof(p));

41 of these (or something like that, I counted by hand) have some sort
of cast in the call expression.
probe_kernel_read() is kinda special in that expected types for the
second argument are both kernel pointers and unsigned longs. It might
make sense to have a wrapper macro around probe_kernel_read() that
accepts anything as long as it's as wide as a pointer... maybe
something for a future refactor.
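
One way the wrapper idea above might look is sketched here in userspace C. Everything in it is hypothetical: `probe_read_stub()` stands in for probe_kernel_read() (the real function handles faults, this stub does not), and `probe_read_any()` is an invented name. The macro relies on GNU C statement expressions and C11 `_Static_assert`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for probe_kernel_read(); no fault handling here. */
static long probe_read_stub(void *dst, const void *src, size_t size)
{
	memcpy(dst, src, size);
	return 0;
}

/*
 * Hypothetical wrapper: accepts a pointer or an unsigned long for addr,
 * provided the argument is exactly as wide as a pointer. Anything else
 * is rejected at compile time.
 */
#define probe_read_any(dst, addr, size) ({				\
	_Static_assert(sizeof(addr) == sizeof(void *),			\
		       "address argument must be pointer-sized");	\
	probe_read_stub((dst), (const void *)(uintptr_t)(addr), (size)); \
})
```

A caller could then pass either `ptr` or `(unsigned long)ptr` without a cast at the call site. Note that an array argument would trip the size check unless it is first decayed to a pointer, so this is a sketch of the idea rather than a drop-in replacement.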

2018-08-31 16:39:24

by Luc Van Oostenryck

Subject: Re: [PATCH] x86/dumpstack: fix address space casting in show_opcodes()

On Fri, Aug 31, 2018 at 03:26:24PM +0200, Jann Horn wrote:
>
> By the way, here are all 60 probe_kernel_read() callers:

...

> 41 of these (or something like that, I counted by hand) have some sort
> of cast in the call expression.
> probe_kernel_read() is kinda special in that expected types for the
> second argument are both kernel pointers and unsigned longs. It might
> make sense to have a wrapper macro around probe_kernel_read() that
> accepts anything as long as it's as wide as a pointer...

Well, if __user pointers should not be accepted, then it's much
better, type-wise, to leave it as const void * (and cast the
ulong args to some plain generic pointer, like void * or u8 *).

-- Luc Van Oostenryck