The pointers for guarded storage and runtime instrumentation control
blocks are stored in the thread_struct of the associated task. These
pointers are initially copied on fork() via arch_dup_task_struct()
and then cleared via copy_thread() before fork() returns. If fork()
happens to fail after the initial task dup and before copy_thread(),
the newly allocated task and associated thread_struct memory are
freed via free_task() -> arch_release_task_struct(). This results in
a double free of the guarded storage and runtime info structs
because the fields in the failed task still refer to memory
associated with the source task.
This problem can manifest as a BUG_ON() in set_freepointer() (with
CONFIG_SLAB_FREELIST_HARDENED enabled) or KASAN splat (if enabled)
when running trinity syscall fuzz tests on s390x. To avoid this
problem, clear the associated pointer fields in
arch_dup_task_struct() immediately after the new task is copied.
Note that the RI flag is still cleared in copy_thread() because it
resides in thread stack memory and that is where stack info is
copied.
Signed-off-by: Brian Foster <[email protected]>
---
Hi all,
Note that I'm not subscribed to the list so please CC on reply. Further,
I'm not terribly familiar with these associated features and so have not
run any kind of functional testing here. My testing was purely around
producing/preventing the double free issue. Any thoughts, reviews or
further testing is appreciated. Thanks.
Brian
arch/s390/kernel/process.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/arch/s390/kernel/process.c b/arch/s390/kernel/process.c
index 89949b9f3cf8..d5119e039d85 100644
--- a/arch/s390/kernel/process.c
+++ b/arch/s390/kernel/process.c
@@ -91,6 +91,18 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
memcpy(dst, src, arch_task_struct_size);
dst->thread.fpu.regs = dst->thread.fpu.fprs;
+
+ /*
+ * Don't transfer over the runtime instrumentation or the guarded
+ * storage control block pointers. These fields are cleared here instead
+ * of in copy_thread() to avoid premature freeing of associated memory
+ * on fork() failure. Wait to clear the RI flag because ->stack still
+ * refers to the source thread.
+ */
+ dst->thread.ri_cb = NULL;
+ dst->thread.gs_cb = NULL;
+ dst->thread.gs_bc_cb = NULL;
+
return 0;
}
@@ -150,13 +162,11 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
frame->childregs.flags = 0;
if (new_stackp)
frame->childregs.gprs[15] = new_stackp;
-
- /* Don't copy runtime instrumentation info */
- p->thread.ri_cb = NULL;
+ /*
+ * Clear the runtime instrumentation flag after the above childregs
+ * copy. The CB pointer was already cleared in arch_dup_task_struct().
+ */
frame->childregs.psw.mask &= ~PSW_MASK_RI;
- /* Don't copy guarded storage control block */
- p->thread.gs_cb = NULL;
- p->thread.gs_bc_cb = NULL;
/* Set a new TLS ? */
if (clone_flags & CLONE_SETTLS) {
--
2.37.1
On Tue, 16 Aug 2022 11:54:07 -0400
Brian Foster <[email protected]> wrote:
> The pointers for guarded storage and runtime instrumentation control
> blocks are stored in the thread_struct of the associated task. These
> pointers are initially copied on fork() via arch_dup_task_struct()
> and then cleared via copy_thread() before fork() returns. If fork()
> happens to fail after the initial task dup and before copy_thread(),
> the newly allocated task and associated thread_struct memory are
> freed via free_task() -> arch_release_task_struct(). This results in
> a double free of the guarded storage and runtime info structs
> because the fields in the failed task still refer to memory
> associated with the source task.
>
> This problem can manifest as a BUG_ON() in set_freepointer() (with
> CONFIG_SLAB_FREELIST_HARDENED enabled) or KASAN splat (if enabled)
> when running trinity syscall fuzz tests on s390x. To avoid this
> problem, clear the associated pointer fields in
> arch_dup_task_struct() immediately after the new task is copied.
> Note that the RI flag is still cleared in copy_thread() because it
> resides in thread stack memory and that is where stack info is
> copied.
>
> Signed-off-by: Brian Foster <[email protected]>
> ---
>
> Hi all,
>
> Note that I'm not subscribed to the list so please CC on reply. Further,
> I'm not terribly familiar with these associated features and so have not
> run any kind of functional testing here. My testing was purely around
> producing/preventing the double free issue. Any thoughts, reviews or
> further testing is appreciated. Thanks.
>
> Brian
>
> arch/s390/kernel/process.c | 22 ++++++++++++++++------
> 1 file changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/arch/s390/kernel/process.c b/arch/s390/kernel/process.c
> index 89949b9f3cf8..d5119e039d85 100644
> --- a/arch/s390/kernel/process.c
> +++ b/arch/s390/kernel/process.c
> @@ -91,6 +91,18 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
>
> memcpy(dst, src, arch_task_struct_size);
> dst->thread.fpu.regs = dst->thread.fpu.fprs;
> +
> + /*
> + * Don't transfer over the runtime instrumentation or the guarded
> + * storage control block pointers. These fields are cleared here instead
> + * of in copy_thread() to avoid premature freeing of associated memory
> + * on fork() failure. Wait to clear the RI flag because ->stack still
> + * refers to the source thread.
> + */
> + dst->thread.ri_cb = NULL;
> + dst->thread.gs_cb = NULL;
> + dst->thread.gs_bc_cb = NULL;
> +
> return 0;
> }
>
> @@ -150,13 +162,11 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
> frame->childregs.flags = 0;
> if (new_stackp)
> frame->childregs.gprs[15] = new_stackp;
> -
> - /* Don't copy runtime instrumentation info */
> - p->thread.ri_cb = NULL;
> + /*
> + * Clear the runtime instrumentation flag after the above childregs
> + * copy. The CB pointer was already cleared in arch_dup_task_struct().
> + */
> frame->childregs.psw.mask &= ~PSW_MASK_RI;
> - /* Don't copy guarded storage control block */
> - p->thread.gs_cb = NULL;
> - p->thread.gs_bc_cb = NULL;
>
> /* Set a new TLS ? */
> if (clone_flags & CLONE_SETTLS) {
Thanks Brian, nice catch! Looks good to me. For completeness, we should
add stable / Fixes tags, like this:
Fixes: 8d9047f8b967c ("s390/runtime instrumentation: simplify task exit handling")
Fixes: 7b83c6297d2fc ("s390/guarded storage: simplify task exit handling")
Cc: <[email protected]> # 4.15
Not 100% sure about the Fixes tags, Heiko should also have a look when
he returns next week.
Reviewed-by: Gerald Schaefer <[email protected]>
On Wed, Aug 17, 2022 at 01:09:06PM +0200, Gerald Schaefer wrote:
> On Tue, 16 Aug 2022 11:54:07 -0400
> Brian Foster <[email protected]> wrote:
>
> > The pointers for guarded storage and runtime instrumentation control
> > blocks are stored in the thread_struct of the associated task. These
> > pointers are initially copied on fork() via arch_dup_task_struct()
> > and then cleared via copy_thread() before fork() returns. If fork()
> > happens to fail after the initial task dup and before copy_thread(),
> > the newly allocated task and associated thread_struct memory are
> > freed via free_task() -> arch_release_task_struct(). This results in
> > a double free of the guarded storage and runtime info structs
> > because the fields in the failed task still refer to memory
> > associated with the source task.
> >
> > This problem can manifest as a BUG_ON() in set_freepointer() (with
> > CONFIG_SLAB_FREELIST_HARDENED enabled) or KASAN splat (if enabled)
> > when running trinity syscall fuzz tests on s390x. To avoid this
> > problem, clear the associated pointer fields in
> > arch_dup_task_struct() immediately after the new task is copied.
> > Note that the RI flag is still cleared in copy_thread() because it
> > resides in thread stack memory and that is where stack info is
> > copied.
> >
> > Signed-off-by: Brian Foster <[email protected]>
> > ---
> >
> > Hi all,
> >
> > Note that I'm not subscribed to the list so please CC on reply. Further,
> > I'm not terribly familiar with these associated features and so have not
> > run any kind of functional testing here. My testing was purely around
> > producing/preventing the double free issue. Any thoughts, reviews or
> > further testing is appreciated. Thanks.
> >
> > Brian
> >
> > arch/s390/kernel/process.c | 22 ++++++++++++++++------
> > 1 file changed, 16 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/s390/kernel/process.c b/arch/s390/kernel/process.c
> > index 89949b9f3cf8..d5119e039d85 100644
> > --- a/arch/s390/kernel/process.c
> > +++ b/arch/s390/kernel/process.c
> > @@ -91,6 +91,18 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
> >
> > memcpy(dst, src, arch_task_struct_size);
> > dst->thread.fpu.regs = dst->thread.fpu.fprs;
> > +
> > + /*
> > + * Don't transfer over the runtime instrumentation or the guarded
> > + * storage control block pointers. These fields are cleared here instead
> > + * of in copy_thread() to avoid premature freeing of associated memory
> > + * on fork() failure. Wait to clear the RI flag because ->stack still
> > + * refers to the source thread.
> > + */
> > + dst->thread.ri_cb = NULL;
> > + dst->thread.gs_cb = NULL;
> > + dst->thread.gs_bc_cb = NULL;
> > +
> > return 0;
> > }
> >
> > @@ -150,13 +162,11 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
> > frame->childregs.flags = 0;
> > if (new_stackp)
> > frame->childregs.gprs[15] = new_stackp;
> > -
> > - /* Don't copy runtime instrumentation info */
> > - p->thread.ri_cb = NULL;
> > + /*
> > + * Clear the runtime instrumentation flag after the above childregs
> > + * copy. The CB pointer was already cleared in arch_dup_task_struct().
> > + */
> > frame->childregs.psw.mask &= ~PSW_MASK_RI;
> > - /* Don't copy guarded storage control block */
> > - p->thread.gs_cb = NULL;
> > - p->thread.gs_bc_cb = NULL;
> >
> > /* Set a new TLS ? */
> > if (clone_flags & CLONE_SETTLS) {
>
> Thanks Brian, nice catch! Looks good to me. For completeness, we should
> add stable / Fixes tags, like this:
>
> Fixes: 8d9047f8b967c ("s390/runtime instrumentation: simplify task exit handling")
> Fixes: 7b83c6297d2fc ("s390/guarded storage: simplify task exit handling")
> Cc: <[email protected]> # 4.15
>
> Not 100% sure about the Fixes tags, Heiko should also have a look when
> he returns next week.
>
> Reviewed-by: Gerald Schaefer <[email protected]>
>
Hi Gerald,
Sounds good, thanks for the review. I'll wait a bit so Heiko has a
chance to take a look and then follow up with the tag updates and any
other necessary changes..
Brian
On Tue, Aug 16, 2022 at 11:54:07AM -0400, Brian Foster wrote:
> The pointers for guarded storage and runtime instrumentation control
> blocks are stored in the thread_struct of the associated task. These
> pointers are initially copied on fork() via arch_dup_task_struct()
> and then cleared via copy_thread() before fork() returns. If fork()
> happens to fail after the initial task dup and before copy_thread(),
> the newly allocated task and associated thread_struct memory are
> freed via free_task() -> arch_release_task_struct(). This results in
> a double free of the guarded storage and runtime info structs
> because the fields in the failed task still refer to memory
> associated with the source task.
>
> This problem can manifest as a BUG_ON() in set_freepointer() (with
> CONFIG_SLAB_FREELIST_HARDENED enabled) or KASAN splat (if enabled)
> when running trinity syscall fuzz tests on s390x. To avoid this
> problem, clear the associated pointer fields in
> arch_dup_task_struct() immediately after the new task is copied.
> Note that the RI flag is still cleared in copy_thread() because it
> resides in thread stack memory and that is where stack info is
> copied.
>
> Signed-off-by: Brian Foster <[email protected]>
> ---
> arch/s390/kernel/process.c | 22 ++++++++++++++++------
> 1 file changed, 16 insertions(+), 6 deletions(-)
Reviewed-by: Heiko Carstens <[email protected]>
Thanks for debugging and fixing this!
Vasily, could you pick this one up, and also add the Fixes: and Cc:
stable tags as proposed by Gerald?
Thanks!
On Wed, Aug 24, 2022 at 10:01:00AM +0200, Heiko Carstens wrote:
> On Tue, Aug 16, 2022 at 11:54:07AM -0400, Brian Foster wrote:
> > The pointers for guarded storage and runtime instrumentation control
> > blocks are stored in the thread_struct of the associated task. These
> > pointers are initially copied on fork() via arch_dup_task_struct()
> > and then cleared via copy_thread() before fork() returns. If fork()
> > happens to fail after the initial task dup and before copy_thread(),
> > the newly allocated task and associated thread_struct memory are
> > freed via free_task() -> arch_release_task_struct(). This results in
> > a double free of the guarded storage and runtime info structs
> > because the fields in the failed task still refer to memory
> > associated with the source task.
> >
> > This problem can manifest as a BUG_ON() in set_freepointer() (with
> > CONFIG_SLAB_FREELIST_HARDENED enabled) or KASAN splat (if enabled)
> > when running trinity syscall fuzz tests on s390x. To avoid this
> > problem, clear the associated pointer fields in
> > arch_dup_task_struct() immediately after the new task is copied.
> > Note that the RI flag is still cleared in copy_thread() because it
> > resides in thread stack memory and that is where stack info is
> > copied.
> >
> > Signed-off-by: Brian Foster <[email protected]>
> > ---
> > arch/s390/kernel/process.c | 22 ++++++++++++++++------
> > 1 file changed, 16 insertions(+), 6 deletions(-)
>
> Reviewed-by: Heiko Carstens <[email protected]>
>
> Thanks for debugging and fixing this!
>
> Vasily, could you pick this one up, and also add the Fixes: and Cc:
> stable tags as proposed by Gerald?
> Thanks!
Applied, Thank you!