2022-11-04 22:40:12

by Sean Christopherson

Subject: [PATCH] x86/mm: Populate KASAN shadow for per-CPU GDT mapping in CPU entry area

Bounce through cea_map_percpu_pages() when setting protections for the
per-CPU GDT mapping so that KASAN populates a shadow for said mapping.
Failure to populate the shadow will result in a not-present #PF during
KASAN validation if the kernel performs a software lookup into the GDT.

The bug is most easily reproduced by doing a sigreturn with a garbage
CS in the sigcontext, e.g.

  int main(void)
  {
    struct sigcontext regs;

    syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
    syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
    syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);

    memset(&regs, 0, sizeof(regs));
    regs.cs = 0x1d0;
    syscall(__NR_rt_sigreturn);
    return 0;
  }

to coerce the kernel into doing a GDT lookup to compute CS.base when
reading the instruction bytes on the subsequent #GP to determine whether
or not the #GP is something the kernel should handle, e.g. to fixup UMIP
violations or to emulate CLI/STI for IOPL=3 applications.
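
As a rough illustration of the software lookup described above, the sketch below splits a selector into its architectural fields and pulls the base out of a legacy 8-byte code/data descriptor. It is a user-space model only: decode_selector() and descriptor_base() are hypothetical names, not kernel functions, and the kernel's own get_desc() path is not reproduced here.

```c
#include <stdint.h>

/* Fields of an x86 segment selector (architectural layout). */
struct selector_fields {
	unsigned int rpl;          /* bits 1:0, requested privilege level */
	unsigned int ti;           /* bit 2, 0 = GDT, 1 = LDT */
	unsigned int index;        /* bits 15:3, descriptor index */
	unsigned int table_offset; /* byte offset of the descriptor in its table */
};

/* Hypothetical helper: split a 16-bit selector into its fields. */
struct selector_fields decode_selector(uint16_t sel)
{
	struct selector_fields f;

	f.rpl = sel & 0x3;
	f.ti = (sel >> 2) & 0x1;
	f.index = sel >> 3;
	/* Descriptors are 8 bytes, so masking RPL and TI gives the offset. */
	f.table_offset = sel & ~0x7u;
	return f;
}

/*
 * Hypothetical helper: extract the 32-bit base from a legacy 8-byte
 * code/data descriptor held as a little-endian 64-bit value.
 * Base bits 23:0 sit in descriptor bits 39:16, base bits 31:24 in 63:56.
 */
uint32_t descriptor_base(uint64_t raw)
{
	uint32_t low = (uint32_t)((raw >> 16) & 0xffffff);
	uint32_t high = (uint32_t)((raw >> 56) & 0xff);

	return low | (high << 24);
}
```

For the selector 0x1d0 used in the reproducer, this gives index 58 with TI=0 and RPL=0, i.e. an 8-byte read at byte offset 0x1d0 of the GDT.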

BUG: unable to handle page fault for address: fffffbc8379ace00
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 16c03a067 P4D 16c03a067 PUD 15b990067 PMD 15b98f067 PTE 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 3 PID: 851 Comm: r2 Not tainted 6.1.0-rc3-next-20221103+ #432
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:kasan_check_range+0xdf/0x190
Call Trace:
<TASK>
get_desc+0xb0/0x1d0
insn_get_seg_base+0x104/0x270
insn_fetch_from_user+0x66/0x80
fixup_umip_exception+0xb1/0x530
exc_general_protection+0x181/0x210
asm_exc_general_protection+0x22/0x30
RIP: 0003:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0003:0000000000000000 EFLAGS: 00000202
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000001d0
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
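
To relate the faulting address in the splat above to the memory being checked, here is a small user-space sketch of the generic KASAN address-to-shadow mapping and its inverse. It assumes the x86-64 defaults (a scale shift of 3 and a shadow offset of 0xdffffc0000000000); the function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed x86-64 generic KASAN defaults; other configs may differ. */
#define SKETCH_KASAN_SHADOW_SCALE_SHIFT 3
#define SKETCH_KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL

/* x86-64 reserves this range for the cpu_entry_area mapping. */
#define SKETCH_CEA_START 0xfffffe0000000000ULL
#define SKETCH_CEA_END   0xfffffe7fffffffffULL

/* One shadow byte tracks 8 bytes of memory. */
uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> SKETCH_KASAN_SHADOW_SCALE_SHIFT) +
	       SKETCH_KASAN_SHADOW_OFFSET;
}

/* Inverse mapping: start of the 8-byte granule a shadow byte covers. */
uint64_t shadow_to_mem(uint64_t shadow)
{
	return (shadow - SKETCH_KASAN_SHADOW_OFFSET)
	       << SKETCH_KASAN_SHADOW_SCALE_SHIFT;
}

/* Returns 1 if addr falls inside the cpu_entry_area range. */
int in_cpu_entry_area(uint64_t addr)
{
	return addr >= SKETCH_CEA_START && addr <= SKETCH_CEA_END;
}
```

Under these assumptions, shadow_to_mem(0xfffffbc8379ace00) yields 0xfffffe41bcd67000, which lies inside the cpu_entry_area range, consistent with the shadow for that area never having been populated.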

Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand")
Reported-by: [email protected]
Cc: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/mm/cpu_entry_area.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c
index dff9001e5e12..4a6440461c10 100644
--- a/arch/x86/mm/cpu_entry_area.c
+++ b/arch/x86/mm/cpu_entry_area.c
@@ -195,7 +195,7 @@ static void __init setup_cpu_entry_area(unsigned int cpu)
 	pgprot_t tss_prot = PAGE_KERNEL;
 #endif

-	cea_set_pte(&cea->gdt, get_cpu_gdt_paddr(cpu), gdt_prot);
+	cea_map_percpu_pages(&cea->gdt, get_cpu_gdt_rw(cpu), 1, gdt_prot);

 	cea_map_percpu_pages(&cea->entry_stack_page,
 			     per_cpu_ptr(&entry_stack_storage, cpu), 1,

base-commit: 81214a573d19ae2fa5b528286ba23cd1cb17feec
--
2.38.1.431.g37b22c650d-goog



2022-11-08 20:45:21

by Andrey Ryabinin

Subject: Re: [PATCH] x86/mm: Populate KASAN shadow for per-CPU GDT mapping in CPU entry area



On 11/5/22 00:24, Sean Christopherson wrote:
> Bounce through cea_map_percpu_pages() when setting protections for the
> per-CPU GDT mapping so that KASAN populates a shadow for said mapping.
> Failure to populate the shadow will result in a not-present #PF during
> KASAN validation if the kernel performs a software lookup into the GDT.
>
> The bug is most easily reproduced by doing a sigreturn with a garbage
> CS in the sigcontext, e.g.
>
>   int main(void)
>   {
>     struct sigcontext regs;
>
>     syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
>     syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
>     syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
>
>     memset(&regs, 0, sizeof(regs));
>     regs.cs = 0x1d0;
>     syscall(__NR_rt_sigreturn);
>     return 0;
>   }
>
> to coerce the kernel into doing a GDT lookup to compute CS.base when
> reading the instruction bytes on the subsequent #GP to determine whether
> or not the #GP is something the kernel should handle, e.g. to fixup UMIP
> violations or to emulate CLI/STI for IOPL=3 applications.
>
> BUG: unable to handle page fault for address: fffffbc8379ace00
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 16c03a067 P4D 16c03a067 PUD 15b990067 PMD 15b98f067 PTE 0
> Oops: 0000 [#1] PREEMPT SMP KASAN
> CPU: 3 PID: 851 Comm: r2 Not tainted 6.1.0-rc3-next-20221103+ #432
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> RIP: 0010:kasan_check_range+0xdf/0x190
> Call Trace:
> <TASK>
> get_desc+0xb0/0x1d0
> insn_get_seg_base+0x104/0x270
> insn_fetch_from_user+0x66/0x80
> fixup_umip_exception+0xb1/0x530
> exc_general_protection+0x181/0x210
> asm_exc_general_protection+0x22/0x30
> RIP: 0003:0x0
> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> RSP: 0003:0000000000000000 EFLAGS: 00000202
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000001d0
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> </TASK>
>
> Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand")
> Reported-by: [email protected]
> Cc: Andrey Ryabinin <[email protected]>
> Cc: Alexander Potapenko <[email protected]>
> Cc: Andrey Konovalov <[email protected]>
> Cc: Dmitry Vyukov <[email protected]>
> Cc: Vincenzo Frascino <[email protected]>
> Cc: [email protected]
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> arch/x86/mm/cpu_entry_area.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c
> index dff9001e5e12..4a6440461c10 100644
> --- a/arch/x86/mm/cpu_entry_area.c
> +++ b/arch/x86/mm/cpu_entry_area.c
> @@ -195,7 +195,7 @@ static void __init setup_cpu_entry_area(unsigned int cpu)
>  	pgprot_t tss_prot = PAGE_KERNEL;
>  #endif
>
> -	cea_set_pte(&cea->gdt, get_cpu_gdt_paddr(cpu), gdt_prot);
> +	cea_map_percpu_pages(&cea->gdt, get_cpu_gdt_rw(cpu), 1, gdt_prot);


I think using kasan_populate_shadow_for_vaddr() in cea_map_percpu_pages() wasn't the right idea.
We should just map the shadow for the entire 'cea' from setup_cpu_entry_area() instead of fixing it up in random places.
Something like this:

---
arch/x86/mm/cpu_entry_area.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c
index dff9001e5e12..b122fa5e805b 100644
--- a/arch/x86/mm/cpu_entry_area.c
+++ b/arch/x86/mm/cpu_entry_area.c
@@ -195,6 +195,9 @@ static void __init setup_cpu_entry_area(unsigned int cpu)
 	pgprot_t tss_prot = PAGE_KERNEL;
 #endif

+	kasan_populate_shadow_for_vaddr(cea, CPU_ENTRY_AREA_SIZE,
+					early_cpu_to_node(cpu));
+
 	cea_set_pte(&cea->gdt, get_cpu_gdt_paddr(cpu), gdt_prot);

 	cea_map_percpu_pages(&cea->entry_stack_page,
--
2.37.4
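
For readers wondering how much shadow a single call over the whole area would cover, the following user-space sketch models the range computation: translate both ends of the region to shadow addresses, then round outward to page boundaries. It assumes 4 KiB pages and the x86-64 generic KASAN defaults; shadow_range_for() is a hypothetical name and the sketch does not allocate or poison anything, unlike the real kernel helper.

```c
#include <stdint.h>

/* Assumed values: 4 KiB pages, x86-64 generic KASAN defaults. */
#define SKETCH_PAGE_SIZE           4096ULL
#define SKETCH_SHADOW_SCALE_SHIFT  3
#define SKETCH_SHADOW_OFFSET       0xdffffc0000000000ULL

struct shadow_range {
	uint64_t start; /* inclusive, page aligned */
	uint64_t end;   /* exclusive, page aligned */
};

/* One shadow byte tracks 8 bytes of memory. */
static uint64_t sketch_mem_to_shadow(uint64_t addr)
{
	return (addr >> SKETCH_SHADOW_SCALE_SHIFT) + SKETCH_SHADOW_OFFSET;
}

/*
 * Hypothetical helper: page-aligned shadow range that would have to be
 * backed so that every byte of [va, va + size) has a shadow byte.
 */
struct shadow_range shadow_range_for(uint64_t va, uint64_t size)
{
	struct shadow_range r;
	uint64_t mask = SKETCH_PAGE_SIZE - 1;

	/* Round the start down and the end up to page boundaries. */
	r.start = sketch_mem_to_shadow(va) & ~mask;
	r.end = (sketch_mem_to_shadow(va + size) + mask) & ~mask;
	return r;
}
```

Because of the 8:1 ratio, one 4 KiB shadow page covers 32 KiB of address space, so populating shadow for a whole per-CPU area in one call touches far fewer shadow pages than the area itself has pages.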