childregs represents the registers which are active for the new thread
in user context. For a kernel thread, childregs->gp is never used since
the kernel gp is not touched by switch_to. For a user mode helper, the
gp value can be observed in user space after execve or possibly by other
means.
Fixes: 7db91e57a0ac ("RISC-V: Task implementation")
Signed-off-by: Stefan O'Rear <[email protected]>
---
arch/riscv/kernel/process.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 92922dbd5b5c..51042f48da17 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -27,8 +27,6 @@
#include <asm/vector.h>
#include <asm/cpufeature.h>
-register unsigned long gp_in_global __asm__("gp");
-
#if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_STACKPROTECTOR_PER_TASK)
#include <linux/stackprotector.h>
unsigned long __stack_chk_guard __read_mostly;
@@ -207,7 +205,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
if (unlikely(args->fn)) {
/* Kernel thread */
memset(childregs, 0, sizeof(struct pt_regs));
- childregs->gp = gp_in_global;
/* Supervisor/Machine, irqs on: */
childregs->status = SR_PP | SR_PIE;
--
2.40.1
Hi Stefan,
On Wed, Mar 27, 2024 at 2:14 PM Stefan O'Rear <[email protected]> wrote:
>
> childregs represents the registers which are active for the new thread
> in user context. For a kernel thread, childregs->gp is never used since
> the kernel gp is not touched by switch_to. For a user mode helper, the
> gp value can be observed in user space after execve or possibly by other
> means.
>
> Fixes: 7db91e57a0ac ("RISC-V: Task implementation")
> Signed-off-by: Stefan O'Rear <[email protected]>
> ---
> arch/riscv/kernel/process.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
> index 92922dbd5b5c..51042f48da17 100644
> --- a/arch/riscv/kernel/process.c
> +++ b/arch/riscv/kernel/process.c
> @@ -27,8 +27,6 @@
> #include <asm/vector.h>
> #include <asm/cpufeature.h>
>
> -register unsigned long gp_in_global __asm__("gp");
> -
> #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_STACKPROTECTOR_PER_TASK)
> #include <linux/stackprotector.h>
> unsigned long __stack_chk_guard __read_mostly;
> @@ -207,7 +205,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
> if (unlikely(args->fn)) {
> /* Kernel thread */
> memset(childregs, 0, sizeof(struct pt_regs));
> - childregs->gp = gp_in_global;
> /* Supervisor/Machine, irqs on: */
> childregs->status = SR_PP | SR_PIE;
>
> --
> 2.40.1
>
>
Can you help express in more detail what the problem was before fixing it?
Thanks,
Yunhui
On Wed, Mar 27, 2024, at 4:43 AM, yunhui cui wrote:
> Hi Stefan,
>
> On Wed, Mar 27, 2024 at 2:14 PM Stefan O'Rear <sorear@fastmailcom> wrote:
>>
>> childregs represents the registers which are active for the new thread
>> in user context. For a kernel thread, childregs->gp is never used since
>> the kernel gp is not touched by switch_to. For a user mode helper, the
>> gp value can be observed in user space after execve or possibly by other
>> means.
>>
>> Fixes: 7db91e57a0ac ("RISC-V: Task implementation")
>> Signed-off-by: Stefan O'Rear <[email protected]>
>> ---
>> arch/riscv/kernel/process.c | 3 ---
>> 1 file changed, 3 deletions(-)
>>
>> diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
>> index 92922dbd5b5c..51042f48da17 100644
>> --- a/arch/riscv/kernel/process.c
>> +++ b/arch/riscv/kernel/process.c
>> @@ -27,8 +27,6 @@
>> #include <asm/vector.h>
>> #include <asm/cpufeature.h>
>>
>> -register unsigned long gp_in_global __asm__("gp");
>> -
>> #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_STACKPROTECTOR_PER_TASK)
>> #include <linux/stackprotector.h>
>> unsigned long __stack_chk_guard __read_mostly;
>> @@ -207,7 +205,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
>> if (unlikely(args->fn)) {
>> /* Kernel thread */
>> memset(childregs, 0, sizeof(struct pt_regs));
>> - childregs->gp = gp_in_global;
>> /* Supervisor/Machine, irqs on: */
>> childregs->status = SR_PP | SR_PIE;
>>
>> --
>> 2.40.1
>>
>>
> Can you help express in more detail what the problem was before fixing it?
It's a KASLR bypass, since gp_in_global is the address of the kernel symbol
__global_pointer$.
The /* Kernel thread */ comment is somewhat inaccurate in that it is also used
for user_mode_helper threads, which exec a user process, e.g. /sbin/init or
when /proc/sys/kernel/core_pattern is a pipe. Such threads do not have
PF_KTHREAD set and are valid targets for ptrace etc. even before they exec.
childregs is the *user* context during syscall execution and it is observable
from userspace in at least five ways:
1. kernel_execve does not currently clear integer registers, so the starting
register state for PID 1 and other user processes started by the kernel has
sp = user stack, gp = kernel __global_pointer$, all other integer registers
zeroed by the memset in the patch comment.
This is a bug in its own right, but I'm unwilling to bet that it is the only
way to exploit the issue addressed by this patch.
2. ptrace(PTRACE_GETREGSET): you can PTRACE_ATTACH to a user_mode_helper thread
before it execs, but ptrace requires SIGSTOP to be delivered which can only
happen at user/kernel boundaries.
3. /proc/*/task/*/syscall: this is perfectly happy to read pt_regs for
user_mode_helpers before the exec completes, but gp is not one of the
registers it returns.
4. PERF_SAMPLE_REGS_USER: LOCKDOWN_PERF normally prevents access to kernel
addresses via PERF_SAMPLE_REGS_INTR, but due to this bug kernel addresses
are also exposed via PERF_SAMPLE_REGS_USER which is permitted under
LOCKDOWN_PERF. I have not attempted to write exploit code.
5. Much of the tracing infrastructure allows access to user registers. I have
not attempted to determine which forms of tracing allow access to user
registers without already allowing access to kernel registers.
Does this help? How much of this should be in the commit message?
-s
> Thanks,
> Yunhui
>
> _______________________________________________
> linux-riscv mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-riscv
Hi Stefan,
On Thu, Mar 28, 2024 at 12:54 AM Stefan O'Rear <[email protected]> wrote:
>
> On Wed, Mar 27, 2024, at 4:43 AM, yunhui cui wrote:
> > Hi Stefan,
> >
> > On Wed, Mar 27, 2024 at 2:14 PM Stefan O'Rear <[email protected]> wrote:
> >>
> >> childregs represents the registers which are active for the new thread
> >> in user context. For a kernel thread, childregs->gp is never used since
> >> the kernel gp is not touched by switch_to. For a user mode helper, the
> >> gp value can be observed in user space after execve or possibly by other
> >> means.
> >>
> >> Fixes: 7db91e57a0ac ("RISC-V: Task implementation")
> >> Signed-off-by: Stefan O'Rear <[email protected]>
> >> ---
> >> arch/riscv/kernel/process.c | 3 ---
> >> 1 file changed, 3 deletions(-)
> >>
> >> diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
> >> index 92922dbd5b5c..51042f48da17 100644
> >> --- a/arch/riscv/kernel/process.c
> >> +++ b/arch/riscv/kernel/process.c
> >> @@ -27,8 +27,6 @@
> >> #include <asm/vector.h>
> >> #include <asm/cpufeature.h>
> >>
> >> -register unsigned long gp_in_global __asm__("gp");
> >> -
> >> #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_STACKPROTECTOR_PER_TASK)
> >> #include <linux/stackprotector.h>
> >> unsigned long __stack_chk_guard __read_mostly;
> >> @@ -207,7 +205,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
> >> if (unlikely(args->fn)) {
> >> /* Kernel thread */
> >> memset(childregs, 0, sizeof(struct pt_regs));
> >> - childregs->gp = gp_in_global;
> >> /* Supervisor/Machine, irqs on: */
> >> childregs->status = SR_PP | SR_PIE;
> >>
> >> --
> >> 2.40.1
> >>
> >>
> > Can you help express in more detail what the problem was before fixing it?
>
> It's a KASLR bypass, since gp_in_global is the address of the kernel symbol
> __global_pointer$.
>
> The /* Kernel thread */ comment is somewhat inaccurate in that it is also used
> for user_mode_helper threads, which exec a user process, e.g. /sbin/init or
> when /proc/sys/kernel/core_pattern is a pipe. Such threads do not have
> PF_KTHREAD set and are valid targets for ptrace etc. even before they exec.
>
> childregs is the *user* context during syscall execution and it is observable
> from userspace in at least five ways:
>
> 1. kernel_execve does not currently clear integer registers, so the starting
> register state for PID 1 and other user processes started by the kernel has
> sp = user stack, gp = kernel __global_pointer$, all other integer registers
> zeroed by the memset in the patch comment.
>
> This is a bug in its own right, but I'm unwilling to bet that it is the only
> way to exploit the issue addressed by this patch.
>
> 2. ptrace(PTRACE_GETREGSET): you can PTRACE_ATTACH to a user_mode_helper thread
> before it execs, but ptrace requires SIGSTOP to be delivered which can only
> happen at user/kernel boundaries.
>
> 3. /proc/*/task/*/syscall: this is perfectly happy to read pt_regs for
> user_mode_helpers before the exec completes, but gp is not one of the
> registers it returns.
>
> 4. PERF_SAMPLE_REGS_USER: LOCKDOWN_PERF normally prevents access to kernel
> addresses via PERF_SAMPLE_REGS_INTR, but due to this bug kernel addresses
> are also exposed via PERF_SAMPLE_REGS_USER which is permitted under
> LOCKDOWN_PERF. I have not attempted to write exploit code.
>
> 5. Much of the tracing infrastructure allows access to user registers. I have
> not attempted to determine which forms of tracing allow access to user
> registers without already allowing access to kernel registers.
>
> Does this help? How much of this should be in the commit message?
Fine enough, Thanks.
Thanks,
Yunhui
Hi Stefan,
On 27/03/2024 17:53, Stefan O'Rear wrote:
> On Wed, Mar 27, 2024, at 4:43 AM, yunhui cui wrote:
>> Hi Stefan,
>>
>> On Wed, Mar 27, 2024 at 2:14 PM Stefan O'Rear <[email protected]> wrote:
>>> childregs represents the registers which are active for the new thread
>>> in user context. For a kernel thread, childregs->gp is never used since
>>> the kernel gp is not touched by switch_to. For a user mode helper, the
>>> gp value can be observed in user space after execve or possibly by other
>>> means.
>>>
>>> Fixes: 7db91e57a0ac ("RISC-V: Task implementation")
>>> Signed-off-by: Stefan O'Rear <[email protected]>
>>> ---
>>> arch/riscv/kernel/process.c | 3 ---
>>> 1 file changed, 3 deletions(-)
>>>
>>> diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
>>> index 92922dbd5b5c..51042f48da17 100644
>>> --- a/arch/riscv/kernel/process.c
>>> +++ b/arch/riscv/kernel/process.c
>>> @@ -27,8 +27,6 @@
>>> #include <asm/vector.h>
>>> #include <asm/cpufeature.h>
>>>
>>> -register unsigned long gp_in_global __asm__("gp");
>>> -
>>> #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_STACKPROTECTOR_PER_TASK)
>>> #include <linux/stackprotector.h>
>>> unsigned long __stack_chk_guard __read_mostly;
>>> @@ -207,7 +205,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
>>> if (unlikely(args->fn)) {
>>> /* Kernel thread */
>>> memset(childregs, 0, sizeof(struct pt_regs));
>>> - childregs->gp = gp_in_global;
>>> /* Supervisor/Machine, irqs on: */
>>> childregs->status = SR_PP | SR_PIE;
>>>
>>> --
>>> 2.40.1
>>>
>>>
>> Can you help express in more detail what the problem was before fixing it?
> It's a KASLR bypass, since gp_in_global is the address of the kernel symbol
> __global_pointer$.
>
> The /* Kernel thread */ comment is somewhat inaccurate in that it is also used
> for user_mode_helper threads, which exec a user process, e.g. /sbin/init or
> when /proc/sys/kernel/core_pattern is a pipe. Such threads do not have
> PF_KTHREAD set and are valid targets for ptrace etc. even before they exec.
>
> childregs is the *user* context during syscall execution and it is observable
> from userspace in at least five ways:
>
> 1. kernel_execve does not currently clear integer registers, so the starting
> register state for PID 1 and other user processes started by the kernel has
> sp = user stack, gp = kernel __global_pointer$, all other integer registers
> zeroed by the memset in the patch comment.
So as I did not this know this path really well, I played a bit and I
can confirm that usermode processes reach userspace with the gp = kernel:
Thread 1 hit Breakpoint 12, 0x00007fff82487fc4 in ?? ()
1: x/i $pc
=> 0x7fff82487fc4: mv a0,sp
3: /x $gp = 0xffffffff817fee50
>
> This is a bug in its own right, but I'm unwilling to bet that it is the only
> way to exploit the issue addressed by this patch.
>
> 2. ptrace(PTRACE_GETREGSET): you can PTRACE_ATTACH to a user_mode_helper thread
> before it execs, but ptrace requires SIGSTOP to be delivered which can only
> happen at user/kernel boundaries.
>
> 3. /proc/*/task/*/syscall: this is perfectly happy to read pt_regs for
> user_mode_helpers before the exec completes, but gp is not one of the
> registers it returns.
>
> 4. PERF_SAMPLE_REGS_USER: LOCKDOWN_PERF normally prevents access to kernel
> addresses via PERF_SAMPLE_REGS_INTR, but due to this bug kernel addresses
> are also exposed via PERF_SAMPLE_REGS_USER which is permitted under
> LOCKDOWN_PERF. I have not attempted to write exploit code.
>
> 5. Much of the tracing infrastructure allows access to user registers. I have
> not attempted to determine which forms of tracing allow access to user
> registers without already allowing access to kernel registers.
>
> Does this help? How much of this should be in the commit message?
I'd put them all, but up to you, at least the first usecase that I was
able to reproduce should be added to the commit log.
You can add:
Reviewed-by: Alexandre Ghiti <[email protected]>
And this should go to -fixes.
Thanks,
Alex
>
> -s
>
>> Thanks,
>> Yunhui
>>
>> _______________________________________________
>> linux-riscv mailing list
>> [email protected]
>> http://lists.infradead.org/mailman/listinfo/linux-riscv
> _______________________________________________
> linux-riscv mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-riscv
On Tue, 02 Apr 2024 02:21:15 PDT (-0700), [email protected] wrote:
> Hi Stefan,
>
> On 27/03/2024 17:53, Stefan O'Rear wrote:
>> On Wed, Mar 27, 2024, at 4:43 AM, yunhui cui wrote:
>>> Hi Stefan,
>>>
>>> On Wed, Mar 27, 2024 at 2:14 PM Stefan O'Rear <[email protected]> wrote:
>>>> childregs represents the registers which are active for the new thread
>>>> in user context. For a kernel thread, childregs->gp is never used since
>>>> the kernel gp is not touched by switch_to. For a user mode helper, the
>>>> gp value can be observed in user space after execve or possibly by other
>>>> means.
>>>>
>>>> Fixes: 7db91e57a0ac ("RISC-V: Task implementation")
>>>> Signed-off-by: Stefan O'Rear <[email protected]>
>>>> ---
>>>> arch/riscv/kernel/process.c | 3 ---
>>>> 1 file changed, 3 deletions(-)
>>>>
>>>> diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
>>>> index 92922dbd5b5c..51042f48da17 100644
>>>> --- a/arch/riscv/kernel/process.c
>>>> +++ b/arch/riscv/kernel/process.c
>>>> @@ -27,8 +27,6 @@
>>>> #include <asm/vector.h>
>>>> #include <asm/cpufeature.h>
>>>>
>>>> -register unsigned long gp_in_global __asm__("gp");
>>>> -
>>>> #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_STACKPROTECTOR_PER_TASK)
>>>> #include <linux/stackprotector.h>
>>>> unsigned long __stack_chk_guard __read_mostly;
>>>> @@ -207,7 +205,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
>>>> if (unlikely(args->fn)) {
>>>> /* Kernel thread */
>>>> memset(childregs, 0, sizeof(struct pt_regs));
>>>> - childregs->gp = gp_in_global;
>>>> /* Supervisor/Machine, irqs on: */
>>>> childregs->status = SR_PP | SR_PIE;
>>>>
>>>> --
>>>> 2.40.1
>>>>
>>>>
>>> Can you help express in more detail what the problem was before fixing it?
>> It's a KASLR bypass, since gp_in_global is the address of the kernel symbol
>> __global_pointer$.
>>
>> The /* Kernel thread */ comment is somewhat inaccurate in that it is also used
>> for user_mode_helper threads, which exec a user process, e.g. /sbin/init or
>> when /proc/sys/kernel/core_pattern is a pipe. Such threads do not have
>> PF_KTHREAD set and are valid targets for ptrace etc. even before they exec.
>>
>> childregs is the *user* context during syscall execution and it is observable
>> from userspace in at least five ways:
>>
>> 1. kernel_execve does not currently clear integer registers, so the starting
>> register state for PID 1 and other user processes started by the kernel has
>> sp = user stack, gp = kernel __global_pointer$, all other integer registers
>> zeroed by the memset in the patch comment.
>
>
> So as I did not this know this path really well, I played a bit and I
> can confirm that usermode processes reach userspace with the gp = kernel:
>
> Thread 1 hit Breakpoint 12, 0x00007fff82487fc4 in ?? ()
> 1: x/i $pc
> => 0x7fff82487fc4: mv a0,sp
> 3: /x $gp = 0xffffffff817fee50
>
>
>>
>> This is a bug in its own right, but I'm unwilling to bet that it is the only
>> way to exploit the issue addressed by this patch.
>>
>> 2. ptrace(PTRACE_GETREGSET): you can PTRACE_ATTACH to a user_mode_helper thread
>> before it execs, but ptrace requires SIGSTOP to be delivered which can only
>> happen at user/kernel boundaries.
>>
>> 3. /proc/*/task/*/syscall: this is perfectly happy to read pt_regs for
>> user_mode_helpers before the exec completes, but gp is not one of the
>> registers it returns.
>>
>> 4. PERF_SAMPLE_REGS_USER: LOCKDOWN_PERF normally prevents access to kernel
>> addresses via PERF_SAMPLE_REGS_INTR, but due to this bug kernel addresses
>> are also exposed via PERF_SAMPLE_REGS_USER which is permitted under
>> LOCKDOWN_PERF. I have not attempted to write exploit code.
>>
>> 5. Much of the tracing infrastructure allows access to user registers. I have
>> not attempted to determine which forms of tracing allow access to user
>> registers without already allowing access to kernel registers.
>>
>> Does this help? How much of this should be in the commit message?
>
>
> I'd put them all, but up to you, at least the first usecase that I was
> able to reproduce should be added to the commit log.
I just pasted it all in the commit, it seems generally useful for people
running into the commit. With the Link tags maybe it's less important
these days, but I always just err on the side of putting more stuff in
the commit messages.
> You can add:
>
> Reviewed-by: Alexandre Ghiti <[email protected]>
>
> And this should go to -fixes.
It's queued up for the tester.
> Thanks,
>
> Alex
>
>
>>
>> -s
>>
>>> Thanks,
>>> Yunhui
>>>
>>> _______________________________________________
>>> linux-riscv mailing list
>>> [email protected]
>>> http://lists.infradead.org/mailman/listinfo/linux-riscv
>> _______________________________________________
>> linux-riscv mailing list
>> [email protected]
>> http://lists.infradead.org/mailman/listinfo/linux-riscv
Hello:
This patch was applied to riscv/linux.git (fixes)
by Palmer Dabbelt <[email protected]>:
On Wed, 27 Mar 2024 02:12:58 -0400 you wrote:
> childregs represents the registers which are active for the new thread
> in user context. For a kernel thread, childregs->gp is never used since
> the kernel gp is not touched by switch_to. For a user mode helper, the
> gp value can be observed in user space after execve or possibly by other
> means.
>
> Fixes: 7db91e57a0ac ("RISC-V: Task implementation")
> Signed-off-by: Stefan O'Rear <[email protected]>
>
> [...]
Here is the summary with links:
- riscv: process: Fix kernel gp leakage
https://git.kernel.org/riscv/c/d14fa1fcf69d
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html