To support evaluating 64-bit kernel mode instructions:
Replace existing checks for user_64bit_mode() with a new helper that
checks whether code is being executed in either 64-bit kernel mode or
64-bit user mode.
Select the GS base depending on whether the instruction is being
evaluated in kernel mode.
Signed-off-by: Jann Horn <[email protected]>
---
Notes:
v2-v7:
no changes
arch/x86/include/asm/ptrace.h | 13 +++++++++++++
arch/x86/lib/insn-eval.c | 26 +++++++++++++++-----------
2 files changed, 28 insertions(+), 11 deletions(-)
diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 5057a8ed100b..ac45b06941a5 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -159,6 +159,19 @@ static inline bool user_64bit_mode(struct pt_regs *regs)
#endif
}
+/*
+ * Determine whether the register set came from any context that is running in
+ * 64-bit mode.
+ */
+static inline bool any_64bit_mode(struct pt_regs *regs)
+{
+#ifdef CONFIG_X86_64
+ return !user_mode(regs) || user_64bit_mode(regs);
+#else
+ return false;
+#endif
+}
+
#ifdef CONFIG_X86_64
#define current_user_stack_pointer() current_pt_regs()->sp
#define compat_user_stack_pointer() current_pt_regs()->sp
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 306c3a0902ba..31600d851fd8 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -155,7 +155,7 @@ static bool check_seg_overrides(struct insn *insn, int regoff)
*/
static int resolve_default_seg(struct insn *insn, struct pt_regs *regs, int off)
{
- if (user_64bit_mode(regs))
+ if (any_64bit_mode(regs))
return INAT_SEG_REG_IGNORE;
/*
* Resolve the default segment register as described in Section 3.7.4
@@ -266,7 +266,7 @@ static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
* which may be invalid at this point.
*/
if (regoff == offsetof(struct pt_regs, ip)) {
- if (user_64bit_mode(regs))
+ if (any_64bit_mode(regs))
return INAT_SEG_REG_IGNORE;
else
return INAT_SEG_REG_CS;
@@ -289,7 +289,7 @@ static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
* In long mode, segment override prefixes are ignored, except for
* overrides for FS and GS.
*/
- if (user_64bit_mode(regs)) {
+ if (any_64bit_mode(regs)) {
if (idx != INAT_SEG_REG_FS &&
idx != INAT_SEG_REG_GS)
idx = INAT_SEG_REG_IGNORE;
@@ -646,23 +646,27 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
*/
return (unsigned long)(sel << 4);
- if (user_64bit_mode(regs)) {
+ if (any_64bit_mode(regs)) {
/*
* Only FS or GS will have a base address, the rest of
* the segments' bases are forced to 0.
*/
unsigned long base;
- if (seg_reg_idx == INAT_SEG_REG_FS)
+ if (seg_reg_idx == INAT_SEG_REG_FS) {
rdmsrl(MSR_FS_BASE, base);
- else if (seg_reg_idx == INAT_SEG_REG_GS)
+ } else if (seg_reg_idx == INAT_SEG_REG_GS) {
/*
* swapgs was called at the kernel entry point. Thus,
* MSR_KERNEL_GS_BASE will have the user-space GS base.
*/
- rdmsrl(MSR_KERNEL_GS_BASE, base);
- else
+ if (user_mode(regs))
+ rdmsrl(MSR_KERNEL_GS_BASE, base);
+ else
+ rdmsrl(MSR_GS_BASE, base);
+ } else {
base = 0;
+ }
return base;
}
@@ -703,7 +707,7 @@ static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
if (sel < 0)
return 0;
- if (user_64bit_mode(regs) || v8086_mode(regs))
+ if (any_64bit_mode(regs) || v8086_mode(regs))
return -1L;
if (!sel)
@@ -948,7 +952,7 @@ static int get_eff_addr_modrm(struct insn *insn, struct pt_regs *regs,
* following instruction.
*/
if (*regoff == -EDOM) {
- if (user_64bit_mode(regs))
+ if (any_64bit_mode(regs))
tmp = regs->ip + insn->length;
else
tmp = 0;
@@ -1250,7 +1254,7 @@ static void __user *get_addr_ref_32(struct insn *insn, struct pt_regs *regs)
* After computed, the effective address is treated as an unsigned
* quantity.
*/
- if (!user_64bit_mode(regs) && ((unsigned int)eff_addr > seg_limit))
+ if (!any_64bit_mode(regs) && ((unsigned int)eff_addr > seg_limit))
goto out;
/*
--
2.24.1.735.g03f4e72817-goog
A frequent cause of #GP exceptions is memory accesses to non-canonical
addresses. Unlike #PF, #GP doesn't come with a fault address in CR2, so
the kernel doesn't currently print the fault address for #GP.
Luckily, we already have the necessary infrastructure for decoding X86
instructions and computing the memory address that is being accessed;
hook it up to the #GP handler so that we can figure out the address
operand of the faulting instruction and print it.
Distinguish two cases:
a) (Part of) the memory range being accessed lies in the non-canonical
address range; in this case, it is likely that the address we
decoded is actually the one that caused the #GP.
b) The entire memory range of the operand we decoded lies in canonical
address space; the #GP may or may not be related in some way to the
address we computed. We'll still print it, but with hedging
language in the message.
While it is already possible to compute the faulting address manually by
disassembling the opcode dump and evaluating the instruction against the
register dump, this should make it slightly easier to identify crashes
at a glance.
Note that the operand length, which we get from the instruction decoder
and use to determine whether the access straddles into non-canonical
address space, is currently somewhat unreliable; but it should be good
enough, considering that Linux on x86-64 never maps the page directly
before the start of the non-canonical range anyway, and therefore the
case where a memory range begins in that page and potentially straddles
into the non-canonical range should be fairly uncommon.
And if we do get this wrong, it only influences whether the error
message claims that the access is canonical.
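As a standalone illustration of that check (not kernel code; it assumes
4-level paging, where __VIRTUAL_MASK covers 47 address bits), the following
small program classifies the non-canonical sample address used later in this
series, plus a canonical address for contrast:

/*
 * Userspace model of the straddle check described above. The constants are
 * assumptions: 4-level paging, 8-byte operand.
 */
#include <stdio.h>
#include <stdint.h>

#define VIRTUAL_MASK (((uint64_t)1 << 47) - 1)

static const char *gp_hint(uint64_t addr, unsigned int opnd_bytes)
{
	/*
	 * Mirrors the check in the patch: the operand is not in the kernel
	 * half, and its last byte lies past the user canonical half.
	 */
	if (addr < ~VIRTUAL_MASK && addr + opnd_bytes - 1 > VIRTUAL_MASK)
		return "probably for non-canonical address";
	return "maybe for address";
}

int main(void)
{
	printf("%s\n", gp_hint(0xe017577ddf75b7ddULL, 8));	/* non-canonical */
	printf("%s\n", gp_hint(0x00007f0000001000ULL, 8));	/* canonical */
	return 0;
}

It prints the two hint strings the handler chooses between, in that order.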
Reviewed-and-tested-by: Sean Christopherson <[email protected]>
Signed-off-by: Jann Horn <[email protected]>
---
Notes:
v2:
- print different message for segment-related GP (Borislav)
- rewrite check for non-canonical address (Sean)
- make it clear we don't know for sure why the GP happened (Andy)
v3:
- change message format to one line (Borislav)
v4:
- rename insn_bytes to insn_buf (Ingo)
- add space after GPFSTR (Ingo)
- make sizeof(desc) clearer (Ingo, Borislav)
- also print the address (with a different message) if it's canonical (Ingo)
v5:
- reword comment on get_kernel_gp_address() (Sean)
- make get_kernel_gp_address() also work on 32-bit (Sean)
- minor nits (Sean)
- more hedging for canonical GP (Sean)
- let get_kernel_gp_address() return an enum (Sean)
- rewrite commit message
v6:
- add comma after GPFSTR (Sean)
- reorder variable declarations (Sean)
v7:
no changes
I have already sent a patch to syzkaller that relaxes their parsing of GPF
messages (https://github.com/google/syzkaller/commit/432c7650) such that
changes like the one in this patch don't break it.
That patch has already made its way into syzbot's syzkaller instances
according to <https://syzkaller.appspot.com/upstream>.
arch/x86/kernel/traps.c | 70 +++++++++++++++++++++++++++++++++++++++--
1 file changed, 67 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index f19de6f45d48..c8b4ae6aed5b 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -56,6 +56,8 @@
#include <asm/mpx.h>
#include <asm/vm86.h>
#include <asm/umip.h>
+#include <asm/insn.h>
+#include <asm/insn-eval.h>
#ifdef CONFIG_X86_64
#include <asm/x86_init.h>
@@ -518,10 +520,55 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, 0, NULL);
}
+enum kernel_gp_hint {
+ GP_NO_HINT,
+ GP_NON_CANONICAL,
+ GP_CANONICAL
+};
+
+/*
+ * When an uncaught #GP occurs, try to determine a memory address accessed by
+ * the instruction and return that address to the caller.
+ * Also try to figure out whether any part of the access to that address was
+ * non-canonical.
+ */
+static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
+ unsigned long *addr)
+{
+ u8 insn_buf[MAX_INSN_SIZE];
+ struct insn insn;
+
+ if (probe_kernel_read(insn_buf, (void *)regs->ip, MAX_INSN_SIZE))
+ return GP_NO_HINT;
+
+ kernel_insn_init(&insn, insn_buf, MAX_INSN_SIZE);
+ insn_get_modrm(&insn);
+ insn_get_sib(&insn);
+ *addr = (unsigned long)insn_get_addr_ref(&insn, regs);
+
+ if (*addr == -1UL)
+ return GP_NO_HINT;
+
+#ifdef CONFIG_X86_64
+ /*
+ * Check that:
+ * - the operand is not in the kernel half
+ * - the last byte of the operand is not in the user canonical half
+ */
+ if (*addr < ~__VIRTUAL_MASK &&
+ *addr + insn.opnd_bytes - 1 > __VIRTUAL_MASK)
+ return GP_NON_CANONICAL;
+#endif
+
+ return GP_CANONICAL;
+}
+
+#define GPFSTR "general protection fault"
+
dotraplinkage void
do_general_protection(struct pt_regs *regs, long error_code)
{
- const char *desc = "general protection fault";
+ char desc[sizeof(GPFSTR) + 50 + 2*sizeof(unsigned long) + 1] = GPFSTR;
struct task_struct *tsk;
RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
@@ -540,6 +587,9 @@ do_general_protection(struct pt_regs *regs, long error_code)
tsk = current;
if (!user_mode(regs)) {
+ enum kernel_gp_hint hint = GP_NO_HINT;
+ unsigned long gp_addr;
+
if (fixup_exception(regs, X86_TRAP_GP, error_code, 0))
return;
@@ -556,8 +606,22 @@ do_general_protection(struct pt_regs *regs, long error_code)
return;
if (notify_die(DIE_GPF, desc, regs, error_code,
- X86_TRAP_GP, SIGSEGV) != NOTIFY_STOP)
- die(desc, regs, error_code);
+ X86_TRAP_GP, SIGSEGV) == NOTIFY_STOP)
+ return;
+
+ if (error_code)
+ snprintf(desc, sizeof(desc), "segment-related " GPFSTR);
+ else
+ hint = get_kernel_gp_address(regs, &gp_addr);
+
+ if (hint != GP_NO_HINT)
+ snprintf(desc, sizeof(desc), GPFSTR ", %s 0x%lx",
+ (hint == GP_NON_CANONICAL) ?
+ "probably for non-canonical address" :
+ "maybe for address",
+ gp_addr);
+
+ die(desc, regs, error_code);
return;
}
--
2.24.1.735.g03f4e72817-goog
Split __die() into __die_header() and __die_body(). This allows inserting
extra information below the header line that initiates the bug report.
Introduce a new function die_addr() that behaves like die(), but is for
faults only and uses __die_header()+__die_body() so that a future commit
can print extra information after the header line.
Signed-off-by: Jann Horn <[email protected]>
---
Notes:
v3:
new patch
v4-v6:
no changes
v7:
- introduce die_addr() instead of open-coding __die_header()
and __die_body() calls in traps.c (Borislav)
- make __die_header() and __die_body() static
- rewrite commit message
arch/x86/include/asm/kdebug.h | 1 +
arch/x86/kernel/dumpstack.c | 24 +++++++++++++++++++++++-
arch/x86/kernel/traps.c | 5 ++++-
3 files changed, 28 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/kdebug.h b/arch/x86/include/asm/kdebug.h
index 75f1e35e7c15..247ab14c6309 100644
--- a/arch/x86/include/asm/kdebug.h
+++ b/arch/x86/include/asm/kdebug.h
@@ -33,6 +33,7 @@ enum show_regs_mode {
};
extern void die(const char *, struct pt_regs *,long);
+void die_addr(const char *str, struct pt_regs *regs, long err, long gp_addr);
extern int __must_check __die(const char *, struct pt_regs *, long);
extern void show_stack_regs(struct pt_regs *regs);
extern void __show_regs(struct pt_regs *regs, enum show_regs_mode);
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index e07424e19274..8995bf10c97c 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -365,7 +365,7 @@ void oops_end(unsigned long flags, struct pt_regs *regs, int signr)
}
NOKPROBE_SYMBOL(oops_end);
-int __die(const char *str, struct pt_regs *regs, long err)
+static void __die_header(const char *str, struct pt_regs *regs, long err)
{
const char *pr = "";
@@ -384,7 +384,11 @@ int __die(const char *str, struct pt_regs *regs, long err)
IS_ENABLED(CONFIG_KASAN) ? " KASAN" : "",
IS_ENABLED(CONFIG_PAGE_TABLE_ISOLATION) ?
(boot_cpu_has(X86_FEATURE_PTI) ? " PTI" : " NOPTI") : "");
+}
+NOKPROBE_SYMBOL(__die_header);
+static int __die_body(const char *str, struct pt_regs *regs, long err)
+{
show_regs(regs);
print_modules();
@@ -394,6 +398,13 @@ int __die(const char *str, struct pt_regs *regs, long err)
return 0;
}
+NOKPROBE_SYMBOL(__die_body);
+
+int __die(const char *str, struct pt_regs *regs, long err)
+{
+ __die_header(str, regs, err);
+ return __die_body(str, regs, err);
+}
NOKPROBE_SYMBOL(__die);
/*
@@ -410,6 +421,17 @@ void die(const char *str, struct pt_regs *regs, long err)
oops_end(flags, regs, sig);
}
+void die_addr(const char *str, struct pt_regs *regs, long err, long gp_addr)
+{
+ unsigned long flags = oops_begin();
+ int sig = SIGSEGV;
+
+ __die_header(str, regs, err);
+ if (__die_body(str, regs, err))
+ sig = 0;
+ oops_end(flags, regs, sig);
+}
+
void show_regs(struct pt_regs *regs)
{
show_regs_print_info(KERN_DEFAULT);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index c8b4ae6aed5b..4c691bb9e0d9 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -621,7 +621,10 @@ do_general_protection(struct pt_regs *regs, long error_code)
"maybe for address",
gp_addr);
- die(desc, regs, error_code);
+ if (hint != GP_NON_CANONICAL)
+ gp_addr = 0;
+
+ die_addr(desc, regs, error_code, gp_addr);
return;
}
--
2.24.1.735.g03f4e72817-goog
Make #GP exceptions caused by out-of-bounds KASAN shadow accesses easier
to understand by computing the address of the original access and
printing that. More details are in the comments in the patch.
This turns an error like this:
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault, probably for non-canonical address
0xe017577ddf75b7dd: 0000 [#1] PREEMPT SMP KASAN PTI
into this:
general protection fault, probably for non-canonical address
0xe017577ddf75b7dd: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: maybe wild-memory-access in range
[0x00badbeefbadbee8-0x00badbeefbadbeef]
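To make the transformation concrete: with the usual x86-64 generic-KASAN
parameters (a shadow offset of 0xdffffc0000000000 and a scale shift of 3;
both are assumptions here, since they are configuration-dependent), the
reported range can be reproduced from the faulting shadow address with a few
lines of standalone C:

#include <stdio.h>
#include <stdint.h>

#define SHADOW_OFFSET		0xdffffc0000000000ULL	/* assumed CONFIG_KASAN_SHADOW_OFFSET */
#define SHADOW_SCALE_SHIFT	3			/* 8 bytes of memory per shadow byte */
#define SHADOW_MASK		((1ULL << SHADOW_SCALE_SHIFT) - 1)

int main(void)
{
	uint64_t addr = 0xe017577ddf75b7ddULL;	/* faulting address from the oops above */
	uint64_t orig = (addr - SHADOW_OFFSET) << SHADOW_SCALE_SHIFT;

	/* Prints [0x00badbeefbadbee8-0x00badbeefbadbeef], matching the sample. */
	printf("[0x%016llx-0x%016llx]\n",
	       (unsigned long long)orig,
	       (unsigned long long)(orig + SHADOW_MASK));
	return 0;
}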
The hook is placed in architecture-independent code, but is currently
only wired up to the X86 exception handler because I'm not sufficiently
familiar with the address space layout and exception handling mechanisms
on other architectures.
Signed-off-by: Jann Horn <[email protected]>
---
Notes:
v2:
- move to mm/kasan/report.c (Dmitry)
- change hook name to be more generic
- use TASK_SIZE instead of TASK_SIZE_MAX for compiling on non-x86
- don't open-code KASAN_SHADOW_MASK (Dmitry)
- add "KASAN: " prefix, but not "BUG: " (Andrey, Dmitry)
- use same naming scheme as get_wild_bug_type (Andrey)
- this version was "Reviewed-by: Dmitry Vyukov <[email protected]>"
v3:
- adjusted example output in commit message based on
changes in preceding patch
- ensure that KASAN output happens after bust_spinlocks(1)
- moved hook in arch/x86/kernel/traps.c such that output
appears after the first line of KASAN-independent error report
v4:
- adjust patch to changes in x86/traps patch
v5:
- adjust patch to changes in x86/traps patch
- fix bug introduced in v3: remove die() call after oops_end()
v6:
- adjust sample output in commit message
v7:
- instead of open-coding __die_header()+__die_body() in traps.c,
insert a hook call into die_body(), introduced in patch 3/4
(Borislav)
arch/x86/kernel/dumpstack.c | 2 ++
arch/x86/mm/kasan_init_64.c | 21 -------------------
include/linux/kasan.h | 6 ++++++
mm/kasan/report.c | 40 +++++++++++++++++++++++++++++++++++++
4 files changed, 48 insertions(+), 21 deletions(-)
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 8995bf10c97c..ae64ec7f752f 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -427,6 +427,8 @@ void die_addr(const char *str, struct pt_regs *regs, long err, long gp_addr)
int sig = SIGSEGV;
__die_header(str, regs, err);
+ if (gp_addr)
+ kasan_non_canonical_hook(gp_addr);
if (__die_body(str, regs, err))
sig = 0;
oops_end(flags, regs, sig);
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index cf5bc37c90ac..763e71abc0fe 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -288,23 +288,6 @@ static void __init kasan_shallow_populate_pgds(void *start, void *end)
} while (pgd++, addr = next, addr != (unsigned long)end);
}
-#ifdef CONFIG_KASAN_INLINE
-static int kasan_die_handler(struct notifier_block *self,
- unsigned long val,
- void *data)
-{
- if (val == DIE_GPF) {
- pr_emerg("CONFIG_KASAN_INLINE enabled\n");
- pr_emerg("GPF could be caused by NULL-ptr deref or user memory access\n");
- }
- return NOTIFY_OK;
-}
-
-static struct notifier_block kasan_die_notifier = {
- .notifier_call = kasan_die_handler,
-};
-#endif
-
void __init kasan_early_init(void)
{
int i;
@@ -341,10 +324,6 @@ void __init kasan_init(void)
int i;
void *shadow_cpu_entry_begin, *shadow_cpu_entry_end;
-#ifdef CONFIG_KASAN_INLINE
- register_die_notifier(&kasan_die_notifier);
-#endif
-
memcpy(early_top_pgt, init_top_pgt, sizeof(early_top_pgt));
/*
diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 4f404c565db1..e0238af0388f 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -225,4 +225,10 @@ static inline void kasan_release_vmalloc(unsigned long start,
unsigned long free_region_end) {}
#endif
+#ifdef CONFIG_KASAN_INLINE
+void kasan_non_canonical_hook(unsigned long addr);
+#else /* CONFIG_KASAN_INLINE */
+static inline void kasan_non_canonical_hook(unsigned long addr) { }
+#endif /* CONFIG_KASAN_INLINE */
+
#endif /* LINUX_KASAN_H */
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 621782100eaa..5ef9f24f566b 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -512,3 +512,43 @@ void __kasan_report(unsigned long addr, size_t size, bool is_write, unsigned lon
end_report(&flags);
}
+
+#ifdef CONFIG_KASAN_INLINE
+/*
+ * With CONFIG_KASAN_INLINE, accesses to bogus pointers (outside the high
+ * canonical half of the address space) cause out-of-bounds shadow memory reads
+ * before the actual access. For addresses in the low canonical half of the
+ * address space, as well as most non-canonical addresses, that out-of-bounds
+ * shadow memory access lands in the non-canonical part of the address space.
+ * Help the user figure out what the original bogus pointer was.
+ */
+void kasan_non_canonical_hook(unsigned long addr)
+{
+ unsigned long orig_addr;
+ const char *bug_type;
+
+ if (addr < KASAN_SHADOW_OFFSET)
+ return;
+
+ orig_addr = (addr - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
+ /*
+ * For faults near the shadow address for NULL, we can be fairly certain
+ * that this is a KASAN shadow memory access.
+ * For faults that correspond to shadow for low canonical addresses, we
+ * can still be pretty sure - that shadow region is a fairly narrow
+ * chunk of the non-canonical address space.
+ * But faults that look like shadow for non-canonical addresses are a
+ * really large chunk of the address space. In that case, we still
+ * print the decoded address, but make it clear that this is not
+ * necessarily what's actually going on.
+ */
+ if (orig_addr < PAGE_SIZE)
+ bug_type = "null-ptr-deref";
+ else if (orig_addr < TASK_SIZE)
+ bug_type = "probably user-memory-access";
+ else
+ bug_type = "maybe wild-memory-access";
+ pr_alert("KASAN: %s in range [0x%016lx-0x%016lx]\n", bug_type,
+ orig_addr, orig_addr + KASAN_SHADOW_MASK);
+}
+#endif
--
2.24.1.735.g03f4e72817-goog
On Thu, Dec 19, 2019 at 12:12 AM Jann Horn <[email protected]> wrote:
>
> Make #GP exceptions caused by out-of-bounds KASAN shadow accesses easier
> to understand by computing the address of the original access and
> printing that. More details are in the comments in the patch.
>
> [...]
Reviewed-by: Dmitry Vyukov <[email protected]>
Thanks!
On Thu, Dec 19, 2019 at 12:11:49AM +0100, Jann Horn wrote:
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index c8b4ae6aed5b..4c691bb9e0d9 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -621,7 +621,10 @@ do_general_protection(struct pt_regs *regs, long error_code)
> "maybe for address",
> gp_addr);
> - die(desc, regs, error_code);
I've added here:
/*
* KASAN is interested only in the non-canonical case, clear it
* otherwise.
*/
> + if (hint != GP_NON_CANONICAL)
> + gp_addr = 0;
otherwise you have:
if (hint != GP_NO_HINT)
...
if (hint != GP_NON_CANONICAL)
...
which is kinda confusing at first glance and one has to follow the
code into die_addr() to figure out the usage of the address argument.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
... and a cleanup ontop:
---
From: Borislav Petkov <[email protected]>
Date: Tue, 31 Dec 2019 17:15:35 +0100
Hoist the user_mode() case up because it is less code and can be dealt
with up-front like the other special cases UMIP and vm86.
This saves an indentation level for the kernel-mode #GP case and allows
the code to be "unfolded" more so that it is more readable.
No functional changes.
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: [email protected]
---
arch/x86/kernel/traps.c | 79 +++++++++++++++++++++--------------------
1 file changed, 40 insertions(+), 39 deletions(-)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 2afd7d8d4007..ca395ad28b4e 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -567,7 +567,10 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
{
char desc[sizeof(GPFSTR) + 50 + 2*sizeof(unsigned long) + 1] = GPFSTR;
+ enum kernel_gp_hint hint = GP_NO_HINT;
struct task_struct *tsk;
+ unsigned long gp_addr;
+ int ret;
RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
cond_local_irq_enable(regs);
@@ -584,58 +587,56 @@ dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
}
tsk = current;
- if (!user_mode(regs)) {
- enum kernel_gp_hint hint = GP_NO_HINT;
- unsigned long gp_addr;
-
- if (fixup_exception(regs, X86_TRAP_GP, error_code, 0))
- return;
+ if (user_mode(regs)) {
tsk->thread.error_code = error_code;
tsk->thread.trap_nr = X86_TRAP_GP;
- /*
- * To be potentially processing a kprobe fault and to
- * trust the result from kprobe_running(), we have to
- * be non-preemptible.
- */
- if (!preemptible() && kprobe_running() &&
- kprobe_fault_handler(regs, X86_TRAP_GP))
- return;
+ show_signal(tsk, SIGSEGV, "", desc, regs, error_code);
+ force_sig(SIGSEGV);
- if (notify_die(DIE_GPF, desc, regs, error_code,
- X86_TRAP_GP, SIGSEGV) == NOTIFY_STOP)
- return;
+ return;
+ }
- if (error_code)
- snprintf(desc, sizeof(desc), "segment-related " GPFSTR);
- else
- hint = get_kernel_gp_address(regs, &gp_addr);
+ if (fixup_exception(regs, X86_TRAP_GP, error_code, 0))
+ return;
- if (hint != GP_NO_HINT)
- snprintf(desc, sizeof(desc), GPFSTR ", %s 0x%lx",
- (hint == GP_NON_CANONICAL) ?
- "probably for non-canonical address" :
- "maybe for address",
- gp_addr);
+ tsk->thread.error_code = error_code;
+ tsk->thread.trap_nr = X86_TRAP_GP;
- /*
- * KASAN is interested only in the non-canonical case, clear it
- * otherwise.
- */
- if (hint != GP_NON_CANONICAL)
- gp_addr = 0;
+ /*
+ * To be potentially processing a kprobe fault and to trust the result
+ * from kprobe_running(), we have to be non-preemptible.
+ */
+ if (!preemptible() &&
+ kprobe_running() &&
+ kprobe_fault_handler(regs, X86_TRAP_GP))
+ return;
- die_addr(desc, regs, error_code, gp_addr);
+ ret = notify_die(DIE_GPF, desc, regs, error_code, X86_TRAP_GP, SIGSEGV);
+ if (ret == NOTIFY_STOP)
return;
- }
- tsk->thread.error_code = error_code;
- tsk->thread.trap_nr = X86_TRAP_GP;
+ if (error_code)
+ snprintf(desc, sizeof(desc), "segment-related " GPFSTR);
+ else
+ hint = get_kernel_gp_address(regs, &gp_addr);
+
+ if (hint != GP_NO_HINT)
+ snprintf(desc, sizeof(desc), GPFSTR ", %s 0x%lx",
+ (hint == GP_NON_CANONICAL) ? "probably for non-canonical address"
+ : "maybe for address",
+ gp_addr);
+
+ /*
+ * KASAN is interested only in the non-canonical case, clear it
+ * otherwise.
+ */
+ if (hint != GP_NON_CANONICAL)
+ gp_addr = 0;
- show_signal(tsk, SIGSEGV, "", desc, regs, error_code);
+ die_addr(desc, regs, error_code, gp_addr);
- force_sig(SIGSEGV);
}
NOKPROBE_SYMBOL(do_general_protection);
--
2.21.0
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
The following commit has been merged into the x86/core branch of tip:
Commit-ID: 7be4412721aee25e35583a20a896085dc6b99c3e
Gitweb: https://git.kernel.org/tip/7be4412721aee25e35583a20a896085dc6b99c3e
Author: Jann Horn <[email protected]>
AuthorDate: Thu, 19 Dec 2019 00:11:47 +01:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Mon, 30 Dec 2019 20:17:15 +01:00
x86/insn-eval: Add support for 64-bit kernel mode
To support evaluating 64-bit kernel mode instructions:
* Replace existing checks for user_64bit_mode() with a new helper that
checks whether code is being executed in either 64-bit kernel mode or
64-bit user mode.
* Select the GS base depending on whether the instruction is being
evaluated in kernel mode.
Signed-off-by: Jann Horn <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: "Gustavo A. R. Silva" <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Oleg Nesterov <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/include/asm/ptrace.h | 13 +++++++++++++
arch/x86/lib/insn-eval.c | 26 +++++++++++++++-----------
2 files changed, 28 insertions(+), 11 deletions(-)
diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 5057a8e..ac45b06 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -159,6 +159,19 @@ static inline bool user_64bit_mode(struct pt_regs *regs)
#endif
}
+/*
+ * Determine whether the register set came from any context that is running in
+ * 64-bit mode.
+ */
+static inline bool any_64bit_mode(struct pt_regs *regs)
+{
+#ifdef CONFIG_X86_64
+ return !user_mode(regs) || user_64bit_mode(regs);
+#else
+ return false;
+#endif
+}
+
#ifdef CONFIG_X86_64
#define current_user_stack_pointer() current_pt_regs()->sp
#define compat_user_stack_pointer() current_pt_regs()->sp
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 306c3a0..31600d8 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -155,7 +155,7 @@ static bool check_seg_overrides(struct insn *insn, int regoff)
*/
static int resolve_default_seg(struct insn *insn, struct pt_regs *regs, int off)
{
- if (user_64bit_mode(regs))
+ if (any_64bit_mode(regs))
return INAT_SEG_REG_IGNORE;
/*
* Resolve the default segment register as described in Section 3.7.4
@@ -266,7 +266,7 @@ static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
* which may be invalid at this point.
*/
if (regoff == offsetof(struct pt_regs, ip)) {
- if (user_64bit_mode(regs))
+ if (any_64bit_mode(regs))
return INAT_SEG_REG_IGNORE;
else
return INAT_SEG_REG_CS;
@@ -289,7 +289,7 @@ static int resolve_seg_reg(struct insn *insn, struct pt_regs *regs, int regoff)
* In long mode, segment override prefixes are ignored, except for
* overrides for FS and GS.
*/
- if (user_64bit_mode(regs)) {
+ if (any_64bit_mode(regs)) {
if (idx != INAT_SEG_REG_FS &&
idx != INAT_SEG_REG_GS)
idx = INAT_SEG_REG_IGNORE;
@@ -646,23 +646,27 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)
*/
return (unsigned long)(sel << 4);
- if (user_64bit_mode(regs)) {
+ if (any_64bit_mode(regs)) {
/*
* Only FS or GS will have a base address, the rest of
* the segments' bases are forced to 0.
*/
unsigned long base;
- if (seg_reg_idx == INAT_SEG_REG_FS)
+ if (seg_reg_idx == INAT_SEG_REG_FS) {
rdmsrl(MSR_FS_BASE, base);
- else if (seg_reg_idx == INAT_SEG_REG_GS)
+ } else if (seg_reg_idx == INAT_SEG_REG_GS) {
/*
* swapgs was called at the kernel entry point. Thus,
* MSR_KERNEL_GS_BASE will have the user-space GS base.
*/
- rdmsrl(MSR_KERNEL_GS_BASE, base);
- else
+ if (user_mode(regs))
+ rdmsrl(MSR_KERNEL_GS_BASE, base);
+ else
+ rdmsrl(MSR_GS_BASE, base);
+ } else {
base = 0;
+ }
return base;
}
@@ -703,7 +707,7 @@ static unsigned long get_seg_limit(struct pt_regs *regs, int seg_reg_idx)
if (sel < 0)
return 0;
- if (user_64bit_mode(regs) || v8086_mode(regs))
+ if (any_64bit_mode(regs) || v8086_mode(regs))
return -1L;
if (!sel)
@@ -948,7 +952,7 @@ static int get_eff_addr_modrm(struct insn *insn, struct pt_regs *regs,
* following instruction.
*/
if (*regoff == -EDOM) {
- if (user_64bit_mode(regs))
+ if (any_64bit_mode(regs))
tmp = regs->ip + insn->length;
else
tmp = 0;
@@ -1250,7 +1254,7 @@ static void __user *get_addr_ref_32(struct insn *insn, struct pt_regs *regs)
* After computed, the effective address is treated as an unsigned
* quantity.
*/
- if (!user_64bit_mode(regs) && ((unsigned int)eff_addr > seg_limit))
+ if (!any_64bit_mode(regs) && ((unsigned int)eff_addr > seg_limit))
goto out;
/*
The following commit has been merged into the x86/core branch of tip:
Commit-ID: 59c1dcbed5b51cab543be8f47b6d7d9cf107ec94
Gitweb: https://git.kernel.org/tip/59c1dcbed5b51cab543be8f47b6d7d9cf107ec94
Author: Jann Horn <[email protected]>
AuthorDate: Thu, 19 Dec 2019 00:11:48 +01:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Tue, 31 Dec 2019 12:31:13 +01:00
x86/traps: Print address on #GP
A frequent cause of #GP exceptions is memory accesses to non-canonical
addresses. Unlike #PF, #GP doesn't report a fault address in CR2, so the
kernel doesn't currently print the fault address for a #GP.
Luckily, the necessary infrastructure for decoding x86 instructions and
computing the memory address being accessed is already present. Hook
it up to the #GP handler so that the address operand of the faulting
instruction can be figured out and printed.
Distinguish two cases:
a) (Part of) the memory range being accessed lies in the non-canonical
address range; in this case, it is likely that the decoded address
is actually the one that caused the #GP.
b) The entire memory range of the decoded operand lies in canonical
address space; the #GP may or may not be related in some way to the
computed address. Print it, but with hedging language in the message.
While it is already possible to compute the faulting address manually by
disassembling the opcode dump and evaluating the instruction against the
register dump, this should make it slightly easier to identify crashes
at a glance.
Note that the operand length, which comes from the instruction decoder
and is used to determine whether the access straddles into non-canonical
address space, is currently somewhat unreliable; but it should be good
enough, considering that Linux on x86-64 never maps the page directly
before the start of the non-canonical range anyway, and therefore the
case where a memory range begins in that page and potentially straddles
into the non-canonical range should be fairly uncommon.
In the case the address is still computed wrongly, it only influences
whether the error message claims that the access is canonical.
[ bp: Remove ambiguous "we", massage, reflow comments and spacing. ]
Signed-off-by: Jann Horn <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Tested-by: Sean Christopherson <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/traps.c | 72 +++++++++++++++++++++++++++++++++++++---
1 file changed, 67 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 05da6b5..108ab1e 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -56,6 +56,8 @@
#include <asm/mpx.h>
#include <asm/vm86.h>
#include <asm/umip.h>
+#include <asm/insn.h>
+#include <asm/insn-eval.h>
#ifdef CONFIG_X86_64
#include <asm/x86_init.h>
@@ -518,10 +520,53 @@ exit_trap:
do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, 0, NULL);
}
-dotraplinkage void
-do_general_protection(struct pt_regs *regs, long error_code)
+enum kernel_gp_hint {
+ GP_NO_HINT,
+ GP_NON_CANONICAL,
+ GP_CANONICAL
+};
+
+/*
+ * When an uncaught #GP occurs, try to determine the memory address accessed by
+ * the instruction and return that address to the caller. Also, try to figure
+ * out whether any part of the access to that address was non-canonical.
+ */
+static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
+ unsigned long *addr)
{
- const char *desc = "general protection fault";
+ u8 insn_buf[MAX_INSN_SIZE];
+ struct insn insn;
+
+ if (probe_kernel_read(insn_buf, (void *)regs->ip, MAX_INSN_SIZE))
+ return GP_NO_HINT;
+
+ kernel_insn_init(&insn, insn_buf, MAX_INSN_SIZE);
+ insn_get_modrm(&insn);
+ insn_get_sib(&insn);
+
+ *addr = (unsigned long)insn_get_addr_ref(&insn, regs);
+ if (*addr == -1UL)
+ return GP_NO_HINT;
+
+#ifdef CONFIG_X86_64
+ /*
+ * Check that:
+ * - the operand is not in the kernel half
+ * - the last byte of the operand is not in the user canonical half
+ */
+ if (*addr < ~__VIRTUAL_MASK &&
+ *addr + insn.opnd_bytes - 1 > __VIRTUAL_MASK)
+ return GP_NON_CANONICAL;
+#endif
+
+ return GP_CANONICAL;
+}
+
+#define GPFSTR "general protection fault"
+
+dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
+{
+ char desc[sizeof(GPFSTR) + 50 + 2*sizeof(unsigned long) + 1] = GPFSTR;
struct task_struct *tsk;
RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
@@ -540,6 +585,9 @@ do_general_protection(struct pt_regs *regs, long error_code)
tsk = current;
if (!user_mode(regs)) {
+ enum kernel_gp_hint hint = GP_NO_HINT;
+ unsigned long gp_addr;
+
if (fixup_exception(regs, X86_TRAP_GP, error_code, 0))
return;
@@ -556,8 +604,22 @@ do_general_protection(struct pt_regs *regs, long error_code)
return;
if (notify_die(DIE_GPF, desc, regs, error_code,
- X86_TRAP_GP, SIGSEGV) != NOTIFY_STOP)
- die(desc, regs, error_code);
+ X86_TRAP_GP, SIGSEGV) == NOTIFY_STOP)
+ return;
+
+ if (error_code)
+ snprintf(desc, sizeof(desc), "segment-related " GPFSTR);
+ else
+ hint = get_kernel_gp_address(regs, &gp_addr);
+
+ if (hint != GP_NO_HINT)
+ snprintf(desc, sizeof(desc), GPFSTR ", %s 0x%lx",
+ (hint == GP_NON_CANONICAL) ?
+ "probably for non-canonical address" :
+ "maybe for address",
+ gp_addr);
+
+ die(desc, regs, error_code);
return;
}
The following commit has been merged into the x86/core branch of tip:
Commit-ID: 2f004eea0fc8f86b45dfc2007add2d4986de8d02
Gitweb: https://git.kernel.org/tip/2f004eea0fc8f86b45dfc2007add2d4986de8d02
Author: Jann Horn <[email protected]>
AuthorDate: Thu, 19 Dec 2019 00:11:50 +01:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Tue, 31 Dec 2019 13:15:38 +01:00
x86/kasan: Print original address on #GP
Make #GP exceptions caused by out-of-bounds KASAN shadow accesses easier
to understand by computing the address of the original access and
printing that. More details are in the comments in the patch.
This turns an error like this:
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault, probably for non-canonical address
0xe017577ddf75b7dd: 0000 [#1] PREEMPT SMP KASAN PTI
into this:
general protection fault, probably for non-canonical address
0xe017577ddf75b7dd: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: maybe wild-memory-access in range
[0x00badbeefbadbee8-0x00badbeefbadbeef]
The hook is placed in architecture-independent code, but is currently
only wired up to the X86 exception handler because I'm not sufficiently
familiar with the address space layout and exception handling mechanisms
on other architectures.
Signed-off-by: Jann Horn <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Dmitry Vyukov <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: linux-mm <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/dumpstack.c | 2 ++-
arch/x86/mm/kasan_init_64.c | 21 +-------------------
include/linux/kasan.h | 6 +++++-
mm/kasan/report.c | 40 ++++++++++++++++++++++++++++++++++++-
4 files changed, 48 insertions(+), 21 deletions(-)
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 8995bf1..ae64ec7 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -427,6 +427,8 @@ void die_addr(const char *str, struct pt_regs *regs, long err, long gp_addr)
int sig = SIGSEGV;
__die_header(str, regs, err);
+ if (gp_addr)
+ kasan_non_canonical_hook(gp_addr);
if (__die_body(str, regs, err))
sig = 0;
oops_end(flags, regs, sig);
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index cf5bc37..763e71a 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -288,23 +288,6 @@ static void __init kasan_shallow_populate_pgds(void *start, void *end)
} while (pgd++, addr = next, addr != (unsigned long)end);
}
-#ifdef CONFIG_KASAN_INLINE
-static int kasan_die_handler(struct notifier_block *self,
- unsigned long val,
- void *data)
-{
- if (val == DIE_GPF) {
- pr_emerg("CONFIG_KASAN_INLINE enabled\n");
- pr_emerg("GPF could be caused by NULL-ptr deref or user memory access\n");
- }
- return NOTIFY_OK;
-}
-
-static struct notifier_block kasan_die_notifier = {
- .notifier_call = kasan_die_handler,
-};
-#endif
-
void __init kasan_early_init(void)
{
int i;
@@ -341,10 +324,6 @@ void __init kasan_init(void)
int i;
void *shadow_cpu_entry_begin, *shadow_cpu_entry_end;
-#ifdef CONFIG_KASAN_INLINE
- register_die_notifier(&kasan_die_notifier);
-#endif
-
memcpy(early_top_pgt, init_top_pgt, sizeof(early_top_pgt));
/*
diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index e18fe54..5cde9e7 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -228,4 +228,10 @@ static inline void kasan_release_vmalloc(unsigned long start,
unsigned long free_region_end) {}
#endif
+#ifdef CONFIG_KASAN_INLINE
+void kasan_non_canonical_hook(unsigned long addr);
+#else /* CONFIG_KASAN_INLINE */
+static inline void kasan_non_canonical_hook(unsigned long addr) { }
+#endif /* CONFIG_KASAN_INLINE */
+
#endif /* LINUX_KASAN_H */
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 6217821..5ef9f24 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -512,3 +512,43 @@ void __kasan_report(unsigned long addr, size_t size, bool is_write, unsigned lon
end_report(&flags);
}
+
+#ifdef CONFIG_KASAN_INLINE
+/*
+ * With CONFIG_KASAN_INLINE, accesses to bogus pointers (outside the high
+ * canonical half of the address space) cause out-of-bounds shadow memory reads
+ * before the actual access. For addresses in the low canonical half of the
+ * address space, as well as most non-canonical addresses, that out-of-bounds
+ * shadow memory access lands in the non-canonical part of the address space.
+ * Help the user figure out what the original bogus pointer was.
+ */
+void kasan_non_canonical_hook(unsigned long addr)
+{
+ unsigned long orig_addr;
+ const char *bug_type;
+
+ if (addr < KASAN_SHADOW_OFFSET)
+ return;
+
+ orig_addr = (addr - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
+ /*
+ * For faults near the shadow address for NULL, we can be fairly certain
+ * that this is a KASAN shadow memory access.
+ * For faults that correspond to shadow for low canonical addresses, we
+ * can still be pretty sure - that shadow region is a fairly narrow
+ * chunk of the non-canonical address space.
+ * But faults that look like shadow for non-canonical addresses are a
+ * really large chunk of the address space. In that case, we still
+ * print the decoded address, but make it clear that this is not
+ * necessarily what's actually going on.
+ */
+ if (orig_addr < PAGE_SIZE)
+ bug_type = "null-ptr-deref";
+ else if (orig_addr < TASK_SIZE)
+ bug_type = "probably user-memory-access";
+ else
+ bug_type = "maybe wild-memory-access";
+ pr_alert("KASAN: %s in range [0x%016lx-0x%016lx]\n", bug_type,
+ orig_addr, orig_addr + KASAN_SHADOW_MASK);
+}
+#endif
The following commit has been merged into the x86/core branch of tip:
Commit-ID: aa49f20462c90df4150f33d245cbcfe0d9c80350
Gitweb: https://git.kernel.org/tip/aa49f20462c90df4150f33d245cbcfe0d9c80350
Author: Jann Horn <[email protected]>
AuthorDate: Thu, 19 Dec 2019 00:11:49 +01:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Tue, 31 Dec 2019 13:11:35 +01:00
x86/dumpstack: Introduce die_addr() for die() with #GP fault address
Split __die() into __die_header() and __die_body(). This allows inserting
extra information below the header line that initiates the bug report.
Introduce a new function die_addr() that behaves like die(), but is for
faults only and uses __die_header() and __die_body() so that a future
commit can print extra information after the header line.
[ bp: Comment the KASAN-specific usage of gp_addr. ]
Signed-off-by: Jann Horn <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: "Peter Zijlstra (Intel)" <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/include/asm/kdebug.h | 1 +
arch/x86/kernel/dumpstack.c | 24 +++++++++++++++++++++++-
arch/x86/kernel/traps.c | 9 ++++++++-
3 files changed, 32 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/kdebug.h b/arch/x86/include/asm/kdebug.h
index 75f1e35..247ab14 100644
--- a/arch/x86/include/asm/kdebug.h
+++ b/arch/x86/include/asm/kdebug.h
@@ -33,6 +33,7 @@ enum show_regs_mode {
};
extern void die(const char *, struct pt_regs *,long);
+void die_addr(const char *str, struct pt_regs *regs, long err, long gp_addr);
extern int __must_check __die(const char *, struct pt_regs *, long);
extern void show_stack_regs(struct pt_regs *regs);
extern void __show_regs(struct pt_regs *regs, enum show_regs_mode);
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index e07424e..8995bf1 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -365,7 +365,7 @@ void oops_end(unsigned long flags, struct pt_regs *regs, int signr)
}
NOKPROBE_SYMBOL(oops_end);
-int __die(const char *str, struct pt_regs *regs, long err)
+static void __die_header(const char *str, struct pt_regs *regs, long err)
{
const char *pr = "";
@@ -384,7 +384,11 @@ int __die(const char *str, struct pt_regs *regs, long err)
IS_ENABLED(CONFIG_KASAN) ? " KASAN" : "",
IS_ENABLED(CONFIG_PAGE_TABLE_ISOLATION) ?
(boot_cpu_has(X86_FEATURE_PTI) ? " PTI" : " NOPTI") : "");
+}
+NOKPROBE_SYMBOL(__die_header);
+static int __die_body(const char *str, struct pt_regs *regs, long err)
+{
show_regs(regs);
print_modules();
@@ -394,6 +398,13 @@ int __die(const char *str, struct pt_regs *regs, long err)
return 0;
}
+NOKPROBE_SYMBOL(__die_body);
+
+int __die(const char *str, struct pt_regs *regs, long err)
+{
+ __die_header(str, regs, err);
+ return __die_body(str, regs, err);
+}
NOKPROBE_SYMBOL(__die);
/*
@@ -410,6 +421,17 @@ void die(const char *str, struct pt_regs *regs, long err)
oops_end(flags, regs, sig);
}
+void die_addr(const char *str, struct pt_regs *regs, long err, long gp_addr)
+{
+ unsigned long flags = oops_begin();
+ int sig = SIGSEGV;
+
+ __die_header(str, regs, err);
+ if (__die_body(str, regs, err))
+ sig = 0;
+ oops_end(flags, regs, sig);
+}
+
void show_regs(struct pt_regs *regs)
{
show_regs_print_info(KERN_DEFAULT);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 108ab1e..2afd7d8 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -619,7 +619,14 @@ dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
"maybe for address",
gp_addr);
- die(desc, regs, error_code);
+ /*
+ * KASAN is interested only in the non-canonical case, clear it
+ * otherwise.
+ */
+ if (hint != GP_NON_CANONICAL)
+ gp_addr = 0;
+
+ die_addr(desc, regs, error_code, gp_addr);
return;
}
On Thu, Dec 19, 2019 at 12:11:47AM +0100, Jann Horn wrote:
> To support evaluating 64-bit kernel mode instructions:
>
> Replace existing checks for user_64bit_mode() with a new helper that
> checks whether code is being executed in either 64-bit kernel mode or
> 64-bit user mode.
>
> Select the GS base depending on whether the instruction is being
> evaluated in kernel mode.
>
> Signed-off-by: Jann Horn <[email protected]>
In most cases you have struct insn around (or can easily pass it down to
the place). Why not use insn->x86_64?
--
Kirill A. Shutemov
> On Jan 2, 2020, at 4:47 PM, Kirill A. Shutemov <[email protected]> wrote:
>
> On Thu, Dec 19, 2019 at 12:11:47AM +0100, Jann Horn wrote:
>> To support evaluating 64-bit kernel mode instructions:
>>
>> Replace existing checks for user_64bit_mode() with a new helper that
>> checks whether code is being executed in either 64-bit kernel mode or
>> 64-bit user mode.
>>
>> Select the GS base depending on whether the instruction is being
>> evaluated in kernel mode.
>>
>> Signed-off-by: Jann Horn <[email protected]>
>
> In most cases you have struct insn around (or can easily pass it down to
> the place). Why not use insn->x86_64?
>
>
What populates that?
FWIW, this code is a bit buggy: it gets EFI mixed mode wrong. I’m not entirely sure we care.
On Thu, Jan 02, 2020 at 04:55:22PM +0900, Andy Lutomirski wrote:
> > In most cases you have struct insn around (or can easily pass it down to
> > the place). Why not use insn->x86_64?
>
> What populates that?
insn_init() AFAICT.
However, you have cases where you don't have struct insn:
fixup_umip_exception() uses it and it calls insn_get_seg_base() which
does use it too.
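For reference, the population path being referred to looks roughly like this
(paraphrased from the in-tree decoder header of that era, so treat the exact
signatures as an assumption rather than a quote):

void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64);

static inline void kernel_insn_init(struct insn *insn,
				    const void *kaddr, int buf_len)
{
#ifdef CONFIG_X86_64
	insn_init(insn, kaddr, buf_len, 1);	/* decode as 64-bit code */
#else /* CONFIG_X86_32 */
	insn_init(insn, kaddr, buf_len, 0);
#endif
}

So insn->x86_64 reflects how the caller asked for the buffer to be decoded,
which for kernel_insn_init() is simply the kernel's own bitness rather than
the mode recorded in pt_regs.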
> FWIW, this code is a bit buggy: it gets EFI mixed mode wrong. I’m
> not entirely sure we care.
We'll cross that bridge when we get there, I'd say.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Jan 02, 2020 at 10:27:33AM +0100, Borislav Petkov wrote:
> On Thu, Jan 02, 2020 at 04:55:22PM +0900, Andy Lutomirski wrote:
> > > In most cases you have struct insn around (or can easily pass it down to
> > > the place). Why not use insn->x86_64?
> >
> > What populates that?
>
> insn_init() AFAICT.
>
> However, you have cases where you don't have struct insn:
> fixup_umip_exception() uses it and it calls insn_get_seg_base() which
> does use it too.
Caller can indicate the bitness directly. It's always 32-bit for UMIP and
get_seg_base_limit() can use insn->x86_64.
--
Kirill A. Shutemov
On Thu, Jan 02, 2020 at 12:49:46PM +0300, Kirill A. Shutemov wrote:
> Caller can indicate the bitness directly. It's always 32-bit for UMIP and
> get_seg_base_limit() can use insn->x86_64.
You can always send patches.
Just don't "fix" it for the sake of fixing it, and the "cleanup" should
really be a cleanup, not just converting it to something else. Oh, and
you need to test it too...
But you know all that. :-)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette