2021-07-25 17:23:34

by Al Viro

[permalink] [raw]
Subject: [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn()

sigreturn has to deal with an unpleasant problem - exception stack frames
have different sizes, depending upon the exception (and processor model, as
well) and variable-sized part of exception frame may contain information
needed for instruction restart. So when signal handler terminates and calls
sigreturn to resume the execution at the place where we'd been when we caught
the signal, it has to rearrange the frame at the bottom of kernel stack.
Worse, it might need to open a gap in the kernel stack, shifting pt_regs
towards lower addresses.

Doing that from C is insane - we'd need to shift stack frames (return addresses,
local variables, etc.) of C call chain, right under the nose of compiler and
hope it won't fall apart horribly. What had been actually done is only slightly
less insane - an inline asm in mangle_kernel_stack() moved the stuff around,
then reset stack pointer and jumped to label in asm glue.

However, we can avoid all that mess if the asm wrapper we have to use anyway
would reserve some space on the stack between switch_stack and the C stack
frame of do_{rt_,}sigreturn(). Then C part can simply memmove() pt_regs +
switch_stack, memcpy() the variable part of exception frame into the opened
gap - all of that without inline asm, buggering C call chain, magical jumps
to asm labels, etc.

Asm wrapper would need to know where the moved switch_stack has ended up -
it might have been shifted into the gap we'd reserved before do_rt_sigreturn()
call. That's where it needs to set the stack pointer to. So let the C part
return just that and be done with that.

While we are at it, the call of berr_040cleanup() we need to do when
returning via 68040 bus error exception frame can be moved into C part
as well.

Signed-off-by: Al Viro <[email protected]>
---
arch/m68k/68000/entry.S | 3 --
arch/m68k/coldfire/entry.S | 3 --
arch/m68k/include/asm/traps.h | 4 ++
arch/m68k/kernel/entry.S | 55 ++++++++++-----------
arch/m68k/kernel/signal.c | 111 ++++++++++++++++--------------------------
5 files changed, 71 insertions(+), 105 deletions(-)

diff --git a/arch/m68k/68000/entry.S b/arch/m68k/68000/entry.S
index 259b3661b614..cce465e850fe 100644
--- a/arch/m68k/68000/entry.S
+++ b/arch/m68k/68000/entry.S
@@ -25,7 +25,6 @@
.globl system_call
.globl resume
.globl ret_from_exception
-.globl ret_from_signal
.globl sys_call_table
.globl bad_interrupt
.globl inthandler1
@@ -59,8 +58,6 @@ do_trace:
subql #4,%sp /* dummy return address */
SAVE_SWITCH_STACK
jbsr syscall_trace_leave
-
-ret_from_signal:
RESTORE_SWITCH_STACK
addql #4,%sp
jra ret_from_exception
diff --git a/arch/m68k/coldfire/entry.S b/arch/m68k/coldfire/entry.S
index d43a02795a4a..68adb7b5b296 100644
--- a/arch/m68k/coldfire/entry.S
+++ b/arch/m68k/coldfire/entry.S
@@ -51,7 +51,6 @@ sw_usp:
.globl system_call
.globl resume
.globl ret_from_exception
-.globl ret_from_signal
.globl sys_call_table
.globl inthandler

@@ -98,8 +97,6 @@ ENTRY(system_call)
subql #4,%sp /* dummy return address */
SAVE_SWITCH_STACK
jbsr syscall_trace_leave
-
-ret_from_signal:
RESTORE_SWITCH_STACK
addql #4,%sp

diff --git a/arch/m68k/include/asm/traps.h b/arch/m68k/include/asm/traps.h
index 4aff3358fbaf..a9d5c1c870d3 100644
--- a/arch/m68k/include/asm/traps.h
+++ b/arch/m68k/include/asm/traps.h
@@ -267,6 +267,10 @@ struct frame {
} un;
};

+#ifdef CONFIG_M68040
+asmlinkage void berr_040cleanup(struct frame *fp);
+#endif
+
#endif /* __ASSEMBLY__ */

#endif /* _M68K_TRAPS_H */
diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S
index ff9e842cec0f..8fa9822b5922 100644
--- a/arch/m68k/kernel/entry.S
+++ b/arch/m68k/kernel/entry.S
@@ -78,20 +78,38 @@ ENTRY(__sys_clone3)

ENTRY(sys_sigreturn)
SAVE_SWITCH_STACK
- movel %sp,%sp@- | switch_stack pointer
- pea %sp@(SWITCH_STACK_SIZE+4) | pt_regs pointer
+ movel %sp,%a1 | switch_stack pointer
+ lea %sp@(SWITCH_STACK_SIZE),%a0 | pt_regs pointer
+ lea %sp@(-84),%sp | leave a gap
+ movel %a1,%sp@-
+ movel %a0,%sp@-
jbsr do_sigreturn
- addql #8,%sp
- RESTORE_SWITCH_STACK
- rts
+ jra 1f | shared with rt_sigreturn()

ENTRY(sys_rt_sigreturn)
SAVE_SWITCH_STACK
- movel %sp,%sp@- | switch_stack pointer
- pea %sp@(SWITCH_STACK_SIZE+4) | pt_regs pointer
+ movel %sp,%a1 | switch_stack pointer
+ lea %sp@(SWITCH_STACK_SIZE),%a0 | pt_regs pointer
+ lea %sp@(-84),%sp | leave a gap
+ movel %a1,%sp@-
+ movel %a0,%sp@-
+ | stack contents:
+ | [original pt_regs address] [original switch_stack address]
+ | [gap] [switch_stack] [pt_regs] [exception frame]
jbsr do_rt_sigreturn
- addql #8,%sp
+
+1:
+ | stack contents now:
+ | [original pt_regs address] [original switch_stack address]
+ | [unused part of the gap] [moved switch_stack] [moved pt_regs]
+ | [replacement exception frame]
+ | return value of do_{rt_,}sigreturn() points to moved switch_stack.
+
+ movel %d0,%sp | discard the leftover junk
RESTORE_SWITCH_STACK
+ | stack contents now is just [syscall return address] [pt_regs] [frame]
+ | return pt_regs.d0
+ movel %sp@(PT_OFF_D0+4),%d0
rts

ENTRY(buserr)
@@ -182,27 +200,6 @@ do_trace_exit:
addql #4,%sp
jra .Lret_from_exception

-ENTRY(ret_from_signal)
- movel %curptr@(TASK_STACK),%a1
- tstb %a1@(TINFO_FLAGS+2)
- jge 1f
- lea %sp@(SWITCH_STACK_SIZE),%a1
- movel %a1,%curptr@(TASK_THREAD+THREAD_ESP0)
- jbsr syscall_trace
-1: RESTORE_SWITCH_STACK
- addql #4,%sp
-/* on 68040 complete pending writebacks if any */
-#ifdef CONFIG_M68040
- bfextu %sp@(PT_OFF_FORMATVEC){#0,#4},%d0
- subql #7,%d0 | bus error frame ?
- jbne 1f
- movel %sp,%sp@-
- jbsr berr_040cleanup
- addql #4,%sp
-1:
-#endif
- jra .Lret_from_exception
-
ENTRY(system_call)
SAVE_ALL_SYS

diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
index cd11eb101eac..338817d0cb3f 100644
--- a/arch/m68k/kernel/signal.c
+++ b/arch/m68k/kernel/signal.c
@@ -641,56 +641,35 @@ static inline void siginfo_build_tests(void)
static int mangle_kernel_stack(struct pt_regs *regs, int formatvec,
void __user *fp)
{
- int fsize = frame_extra_sizes(formatvec >> 12);
- if (fsize < 0) {
+ int extra = frame_extra_sizes(formatvec >> 12);
+ char buf[sizeof_field(struct frame, un)];
+
+ if (extra < 0) {
/*
* user process trying to return with weird frame format
*/
pr_debug("user process returning with weird frame format\n");
- return 1;
+ return -1;
}
- if (!fsize) {
- regs->format = formatvec >> 12;
- regs->vector = formatvec & 0xfff;
- } else {
- struct switch_stack *sw = (struct switch_stack *)regs - 1;
- /* yes, twice as much as max(sizeof(frame.un.fmt<x>)) */
- unsigned long buf[sizeof_field(struct frame, un) / 2];
-
- /* that'll make sure that expansion won't crap over data */
- if (copy_from_user(buf + fsize / 4, fp, fsize))
- return 1;
-
- /* point of no return */
- regs->format = formatvec >> 12;
- regs->vector = formatvec & 0xfff;
-#define frame_offset (sizeof(struct pt_regs)+sizeof(struct switch_stack))
- __asm__ __volatile__ (
-#ifdef CONFIG_COLDFIRE
- " movel %0,%/sp\n\t"
- " bra ret_from_signal\n"
-#else
- " movel %0,%/a0\n\t"
- " subl %1,%/a0\n\t" /* make room on stack */
- " movel %/a0,%/sp\n\t" /* set stack pointer */
- /* move switch_stack and pt_regs */
- "1: movel %0@+,%/a0@+\n\t"
- " dbra %2,1b\n\t"
- " lea %/sp@(%c3),%/a0\n\t" /* add offset of fmt */
- " lsrl #2,%1\n\t"
- " subql #1,%1\n\t"
- /* copy to the gap we'd made */
- "2: movel %4@+,%/a0@+\n\t"
- " dbra %1,2b\n\t"
- " bral ret_from_signal\n"
+ if (extra && copy_from_user(buf, fp, extra))
+ return -1;
+ regs->format = formatvec >> 12;
+ regs->vector = formatvec & 0xfff;
+ if (extra) {
+ void *p = (struct switch_stack *)regs - 1;
+ struct frame *new = (void *)regs - extra;
+ int size = sizeof(struct pt_regs)+sizeof(struct switch_stack);
+
+ memmove(p - extra, p, size);
+ memcpy(p - extra + size, buf, extra);
+ current->thread.esp0 = (unsigned long)&new->ptregs;
+#ifdef CONFIG_M68040
+ /* on 68040 complete pending writebacks if any */
+ if (new->ptregs.format == 7) // bus error frame
+ berr_040cleanup(new);
#endif
- : /* no outputs, it doesn't ever return */
- : "a" (sw), "d" (fsize), "d" (frame_offset/4-1),
- "n" (frame_offset), "a" (buf + fsize/4)
- : "a0");
-#undef frame_offset
}
- return 0;
+ return extra;
}

static inline int
@@ -698,7 +677,6 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *usc, void __u
{
int formatvec;
struct sigcontext context;
- int err = 0;

siginfo_build_tests();

@@ -707,7 +685,7 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *usc, void __u

/* get previous context */
if (copy_from_user(&context, usc, sizeof(context)))
- goto badframe;
+ return -1;

/* restore passed registers */
regs->d0 = context.sc_d0;
@@ -720,15 +698,10 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *usc, void __u
wrusp(context.sc_usp);
formatvec = context.sc_formatvec;

- err = restore_fpu_state(&context);
-
- if (err || mangle_kernel_stack(regs, formatvec, fp))
- goto badframe;
-
- return 0;
+ if (restore_fpu_state(&context))
+ return -1;

-badframe:
- return 1;
+ return mangle_kernel_stack(regs, formatvec, fp);
}

static inline int
@@ -745,7 +718,7 @@ rt_restore_ucontext(struct pt_regs *regs, struct switch_stack *sw,

err = __get_user(temp, &uc->uc_mcontext.version);
if (temp != MCONTEXT_VERSION)
- goto badframe;
+ return -1;
/* restore passed registers */
err |= __get_user(regs->d0, &gregs[0]);
err |= __get_user(regs->d1, &gregs[1]);
@@ -774,22 +747,17 @@ rt_restore_ucontext(struct pt_regs *regs, struct switch_stack *sw,
err |= restore_altstack(&uc->uc_stack);

if (err)
- goto badframe;
-
- if (mangle_kernel_stack(regs, temp, &uc->uc_extra))
- goto badframe;
+ return -1;

- return 0;
-
-badframe:
- return 1;
+ return mangle_kernel_stack(regs, temp, &uc->uc_extra);
}

-asmlinkage int do_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
+asmlinkage void *do_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
{
unsigned long usp = rdusp();
struct sigframe __user *frame = (struct sigframe __user *)(usp - 4);
sigset_t set;
+ int size;

if (!access_ok(frame, sizeof(*frame)))
goto badframe;
@@ -801,20 +769,22 @@ asmlinkage int do_sigreturn(struct pt_regs *regs, struct switch_stack *sw)

set_current_blocked(&set);

- if (restore_sigcontext(regs, &frame->sc, frame + 1))
+ size = restore_sigcontext(regs, &frame->sc, frame + 1);
+ if (size < 0)
goto badframe;
- return regs->d0;
+ return (void *)sw - size;

badframe:
force_sig(SIGSEGV);
- return 0;
+ return sw;
}

-asmlinkage int do_rt_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
+asmlinkage void *do_rt_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
{
unsigned long usp = rdusp();
struct rt_sigframe __user *frame = (struct rt_sigframe __user *)(usp - 4);
sigset_t set;
+ int size;

if (!access_ok(frame, sizeof(*frame)))
goto badframe;
@@ -823,13 +793,14 @@ asmlinkage int do_rt_sigreturn(struct pt_regs *regs, struct switch_stack *sw)

set_current_blocked(&set);

- if (rt_restore_ucontext(regs, sw, &frame->uc))
+ size = rt_restore_ucontext(regs, sw, &frame->uc);
+ if (size < 0)
goto badframe;
- return regs->d0;
+ return (void *)sw - size;

badframe:
force_sig(SIGSEGV);
- return 0;
+ return sw;
}

static inline struct pt_regs *rte_regs(struct pt_regs *regs)
--
2.11.0


2021-09-15 23:39:30

by Michael Schmitz

[permalink] [raw]
Subject: Re: [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn()

Hi Al,

On 26/07/21 5:20 am, Al Viro wrote:
> sigreturn has to deal with an unpleasant problem - exception stack frames
> have different sizes, depending upon the exception (and processor model, as
> well) and variable-sized part of exception frame may contain information
> needed for instruction restart. So when signal handler terminates and calls
> sigreturn to resume the execution at the place where we'd been when we caught
> the signal, it has to rearrange the frame at the bottom of kernel stack.
> Worse, it might need to open a gap in the kernel stack, shifting pt_regs
> towards lower addresses.
>
> Doing that from C is insane - we'd need to shift stack frames (return addresses,
> local variables, etc.) of C call chain, right under the nose of compiler and
> hope it won't fall apart horribly. What had been actually done is only slightly
> less insane - an inline asm in mangle_kernel_stack() moved the stuff around,
> then reset stack pointer and jumped to label in asm glue.
>
> However, we can avoid all that mess if the asm wrapper we have to use anyway
> would reserve some space on the stack between switch_stack and the C stack
> frame of do_{rt_,}sigreturn(). Then C part can simply memmove() pt_regs +
> switch_stack, memcpy() the variable part of exception frame into the opened
> gap - all of that without inline asm, buggering C call chain, magical jumps
> to asm labels, etc.
>
> Asm wrapper would need to know where the moved switch_stack has ended up -
> it might have been shifted into the gap we'd reserved before do_rt_sigreturn()
> call. That's where it needs to set the stack pointer to. So let the C part
> return just that and be done with that.
>
> While we are at it, the call of berr_040cleanup() we need to do when
> returning via 68040 bus error exception frame can be moved into C part
> as well.
>
> Signed-off-by: Al Viro <[email protected]>

This one's a little harder - you use a 84 byte gap on each sigreturn, no
matter what the frame size we need to restore. The original
mangle_kernel_stack() only makes room on the stack when it has no other
option (using twice as much size - correct me if I'm wrong).

Ideally, we'd only leave a gap for mangle_kernel_stack() to use if the
frame size requires us to do so. Working that out in asm glue would be
sufficiently convoluted as to cancel out the benefits of cleaning up the
C sigreturn part. Probably not worth it.

Looking at how likely it is that we'll see multiple pending signals, I'd
say from the frequency of the resulting panic messages in the past four
years it's quite rare. I've never seen these faults on anything but a
fully loaded system running some sort of stress test. OTOH, the
pathological case of exception frame (mid-instruction access fault) is
expected when we might want to send a segfault signal, which may happen
anytime.

On balance, I think the extra stack use will occur rare enough and the
benefit of cleaning up mangle_kernel_stack() outweighs that.

Tested-by: Michael Schmitz <[email protected]>

Reviewed-by: Michael Schmitz <[email protected]>



> ---
> arch/m68k/68000/entry.S | 3 --
> arch/m68k/coldfire/entry.S | 3 --
> arch/m68k/include/asm/traps.h | 4 ++
> arch/m68k/kernel/entry.S | 55 ++++++++++-----------
> arch/m68k/kernel/signal.c | 111 ++++++++++++++++--------------------------
> 5 files changed, 71 insertions(+), 105 deletions(-)
>
> diff --git a/arch/m68k/68000/entry.S b/arch/m68k/68000/entry.S
> index 259b3661b614..cce465e850fe 100644
> --- a/arch/m68k/68000/entry.S
> +++ b/arch/m68k/68000/entry.S
> @@ -25,7 +25,6 @@
> .globl system_call
> .globl resume
> .globl ret_from_exception
> -.globl ret_from_signal
> .globl sys_call_table
> .globl bad_interrupt
> .globl inthandler1
> @@ -59,8 +58,6 @@ do_trace:
> subql #4,%sp /* dummy return address */
> SAVE_SWITCH_STACK
> jbsr syscall_trace_leave
> -
> -ret_from_signal:
> RESTORE_SWITCH_STACK
> addql #4,%sp
> jra ret_from_exception
> diff --git a/arch/m68k/coldfire/entry.S b/arch/m68k/coldfire/entry.S
> index d43a02795a4a..68adb7b5b296 100644
> --- a/arch/m68k/coldfire/entry.S
> +++ b/arch/m68k/coldfire/entry.S
> @@ -51,7 +51,6 @@ sw_usp:
> .globl system_call
> .globl resume
> .globl ret_from_exception
> -.globl ret_from_signal
> .globl sys_call_table
> .globl inthandler
>
> @@ -98,8 +97,6 @@ ENTRY(system_call)
> subql #4,%sp /* dummy return address */
> SAVE_SWITCH_STACK
> jbsr syscall_trace_leave
> -
> -ret_from_signal:
> RESTORE_SWITCH_STACK
> addql #4,%sp
>
> diff --git a/arch/m68k/include/asm/traps.h b/arch/m68k/include/asm/traps.h
> index 4aff3358fbaf..a9d5c1c870d3 100644
> --- a/arch/m68k/include/asm/traps.h
> +++ b/arch/m68k/include/asm/traps.h
> @@ -267,6 +267,10 @@ struct frame {
> } un;
> };
>
> +#ifdef CONFIG_M68040
> +asmlinkage void berr_040cleanup(struct frame *fp);
> +#endif
> +
> #endif /* __ASSEMBLY__ */
>
> #endif /* _M68K_TRAPS_H */
> diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S
> index ff9e842cec0f..8fa9822b5922 100644
> --- a/arch/m68k/kernel/entry.S
> +++ b/arch/m68k/kernel/entry.S
> @@ -78,20 +78,38 @@ ENTRY(__sys_clone3)
>
> ENTRY(sys_sigreturn)
> SAVE_SWITCH_STACK
> - movel %sp,%sp@- | switch_stack pointer
> - pea %sp@(SWITCH_STACK_SIZE+4) | pt_regs pointer
> + movel %sp,%a1 | switch_stack pointer
> + lea %sp@(SWITCH_STACK_SIZE),%a0 | pt_regs pointer
> + lea %sp@(-84),%sp | leave a gap
> + movel %a1,%sp@-
> + movel %a0,%sp@-
> jbsr do_sigreturn
> - addql #8,%sp
> - RESTORE_SWITCH_STACK
> - rts
> + jra 1f | shared with rt_sigreturn()
>
> ENTRY(sys_rt_sigreturn)
> SAVE_SWITCH_STACK
> - movel %sp,%sp@- | switch_stack pointer
> - pea %sp@(SWITCH_STACK_SIZE+4) | pt_regs pointer
> + movel %sp,%a1 | switch_stack pointer
> + lea %sp@(SWITCH_STACK_SIZE),%a0 | pt_regs pointer
> + lea %sp@(-84),%sp | leave a gap
> + movel %a1,%sp@-
> + movel %a0,%sp@-
> + | stack contents:
> + | [original pt_regs address] [original switch_stack address]
> + | [gap] [switch_stack] [pt_regs] [exception frame]
> jbsr do_rt_sigreturn
> - addql #8,%sp
> +
> +1:
> + | stack contents now:
> + | [original pt_regs address] [original switch_stack address]
> + | [unused part of the gap] [moved switch_stack] [moved pt_regs]
> + | [replacement exception frame]
> + | return value of do_{rt_,}sigreturn() points to moved switch_stack.
> +
> + movel %d0,%sp | discard the leftover junk
> RESTORE_SWITCH_STACK
> + | stack contents now is just [syscall return address] [pt_regs] [frame]
> + | return pt_regs.d0
> + movel %sp@(PT_OFF_D0+4),%d0
> rts
>
> ENTRY(buserr)
> @@ -182,27 +200,6 @@ do_trace_exit:
> addql #4,%sp
> jra .Lret_from_exception
>
> -ENTRY(ret_from_signal)
> - movel %curptr@(TASK_STACK),%a1
> - tstb %a1@(TINFO_FLAGS+2)
> - jge 1f
> - lea %sp@(SWITCH_STACK_SIZE),%a1
> - movel %a1,%curptr@(TASK_THREAD+THREAD_ESP0)
> - jbsr syscall_trace
> -1: RESTORE_SWITCH_STACK
> - addql #4,%sp
> -/* on 68040 complete pending writebacks if any */
> -#ifdef CONFIG_M68040
> - bfextu %sp@(PT_OFF_FORMATVEC){#0,#4},%d0
> - subql #7,%d0 | bus error frame ?
> - jbne 1f
> - movel %sp,%sp@-
> - jbsr berr_040cleanup
> - addql #4,%sp
> -1:
> -#endif
> - jra .Lret_from_exception
> -
> ENTRY(system_call)
> SAVE_ALL_SYS
>
> diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
> index cd11eb101eac..338817d0cb3f 100644
> --- a/arch/m68k/kernel/signal.c
> +++ b/arch/m68k/kernel/signal.c
> @@ -641,56 +641,35 @@ static inline void siginfo_build_tests(void)
> static int mangle_kernel_stack(struct pt_regs *regs, int formatvec,
> void __user *fp)
> {
> - int fsize = frame_extra_sizes(formatvec >> 12);
> - if (fsize < 0) {
> + int extra = frame_extra_sizes(formatvec >> 12);
> + char buf[sizeof_field(struct frame, un)];
> +
> + if (extra < 0) {
> /*
> * user process trying to return with weird frame format
> */
> pr_debug("user process returning with weird frame format\n");
> - return 1;
> + return -1;
> }
> - if (!fsize) {
> - regs->format = formatvec >> 12;
> - regs->vector = formatvec & 0xfff;
> - } else {
> - struct switch_stack *sw = (struct switch_stack *)regs - 1;
> - /* yes, twice as much as max(sizeof(frame.un.fmt<x>)) */
> - unsigned long buf[sizeof_field(struct frame, un) / 2];
> -
> - /* that'll make sure that expansion won't crap over data */
> - if (copy_from_user(buf + fsize / 4, fp, fsize))
> - return 1;
> -
> - /* point of no return */
> - regs->format = formatvec >> 12;
> - regs->vector = formatvec & 0xfff;
> -#define frame_offset (sizeof(struct pt_regs)+sizeof(struct switch_stack))
> - __asm__ __volatile__ (
> -#ifdef CONFIG_COLDFIRE
> - " movel %0,%/sp\n\t"
> - " bra ret_from_signal\n"
> -#else
> - " movel %0,%/a0\n\t"
> - " subl %1,%/a0\n\t" /* make room on stack */
> - " movel %/a0,%/sp\n\t" /* set stack pointer */
> - /* move switch_stack and pt_regs */
> - "1: movel %0@+,%/a0@+\n\t"
> - " dbra %2,1b\n\t"
> - " lea %/sp@(%c3),%/a0\n\t" /* add offset of fmt */
> - " lsrl #2,%1\n\t"
> - " subql #1,%1\n\t"
> - /* copy to the gap we'd made */
> - "2: movel %4@+,%/a0@+\n\t"
> - " dbra %1,2b\n\t"
> - " bral ret_from_signal\n"
> + if (extra && copy_from_user(buf, fp, extra))
> + return -1;
> + regs->format = formatvec >> 12;
> + regs->vector = formatvec & 0xfff;
> + if (extra) {
> + void *p = (struct switch_stack *)regs - 1;
> + struct frame *new = (void *)regs - extra;
> + int size = sizeof(struct pt_regs)+sizeof(struct switch_stack);
> +
> + memmove(p - extra, p, size);
> + memcpy(p - extra + size, buf, extra);
> + current->thread.esp0 = (unsigned long)&new->ptregs;
> +#ifdef CONFIG_M68040
> + /* on 68040 complete pending writebacks if any */
> + if (new->ptregs.format == 7) // bus error frame
> + berr_040cleanup(new);
> #endif
> - : /* no outputs, it doesn't ever return */
> - : "a" (sw), "d" (fsize), "d" (frame_offset/4-1),
> - "n" (frame_offset), "a" (buf + fsize/4)
> - : "a0");
> -#undef frame_offset
> }
> - return 0;
> + return extra;
> }
>
> static inline int
> @@ -698,7 +677,6 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *usc, void __u
> {
> int formatvec;
> struct sigcontext context;
> - int err = 0;
>
> siginfo_build_tests();
>
> @@ -707,7 +685,7 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *usc, void __u
>
> /* get previous context */
> if (copy_from_user(&context, usc, sizeof(context)))
> - goto badframe;
> + return -1;
>
> /* restore passed registers */
> regs->d0 = context.sc_d0;
> @@ -720,15 +698,10 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *usc, void __u
> wrusp(context.sc_usp);
> formatvec = context.sc_formatvec;
>
> - err = restore_fpu_state(&context);
> -
> - if (err || mangle_kernel_stack(regs, formatvec, fp))
> - goto badframe;
> -
> - return 0;
> + if (restore_fpu_state(&context))
> + return -1;
>
> -badframe:
> - return 1;
> + return mangle_kernel_stack(regs, formatvec, fp);
> }
>
> static inline int
> @@ -745,7 +718,7 @@ rt_restore_ucontext(struct pt_regs *regs, struct switch_stack *sw,
>
> err = __get_user(temp, &uc->uc_mcontext.version);
> if (temp != MCONTEXT_VERSION)
> - goto badframe;
> + return -1;
> /* restore passed registers */
> err |= __get_user(regs->d0, &gregs[0]);
> err |= __get_user(regs->d1, &gregs[1]);
> @@ -774,22 +747,17 @@ rt_restore_ucontext(struct pt_regs *regs, struct switch_stack *sw,
> err |= restore_altstack(&uc->uc_stack);
>
> if (err)
> - goto badframe;
> -
> - if (mangle_kernel_stack(regs, temp, &uc->uc_extra))
> - goto badframe;
> + return -1;
>
> - return 0;
> -
> -badframe:
> - return 1;
> + return mangle_kernel_stack(regs, temp, &uc->uc_extra);
> }
>
> -asmlinkage int do_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
> +asmlinkage void *do_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
> {
> unsigned long usp = rdusp();
> struct sigframe __user *frame = (struct sigframe __user *)(usp - 4);
> sigset_t set;
> + int size;
>
> if (!access_ok(frame, sizeof(*frame)))
> goto badframe;
> @@ -801,20 +769,22 @@ asmlinkage int do_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
>
> set_current_blocked(&set);
>
> - if (restore_sigcontext(regs, &frame->sc, frame + 1))
> + size = restore_sigcontext(regs, &frame->sc, frame + 1);
> + if (size < 0)
> goto badframe;
> - return regs->d0;
> + return (void *)sw - size;
>
> badframe:
> force_sig(SIGSEGV);
> - return 0;
> + return sw;
> }
>
> -asmlinkage int do_rt_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
> +asmlinkage void *do_rt_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
> {
> unsigned long usp = rdusp();
> struct rt_sigframe __user *frame = (struct rt_sigframe __user *)(usp - 4);
> sigset_t set;
> + int size;
>
> if (!access_ok(frame, sizeof(*frame)))
> goto badframe;
> @@ -823,13 +793,14 @@ asmlinkage int do_rt_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
>
> set_current_blocked(&set);
>
> - if (rt_restore_ucontext(regs, sw, &frame->uc))
> + size = rt_restore_ucontext(regs, sw, &frame->uc);
> + if (size < 0)
> goto badframe;
> - return regs->d0;
> + return (void *)sw - size;
>
> badframe:
> force_sig(SIGSEGV);
> - return 0;
> + return sw;
> }
>
> static inline struct pt_regs *rte_regs(struct pt_regs *regs)

2021-09-16 00:21:28

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn()

On Thu, Sep 16, 2021 at 11:35:05AM +1200, Michael Schmitz wrote:

> This one's a little harder - you use a 84 byte gap on each sigreturn, no
> matter what the frame size we need to restore. The original
> mangle_kernel_stack() only makes room on the stack when it has no other
> option (using twice as much size - correct me if I'm wrong).
>
> Ideally, we'd only leave a gap for mangle_kernel_stack() to use if the frame
> size requires us to do so. Working that out in asm glue would be
> sufficiently convoluted as to cancel out the benefits of cleaning up the C
> sigreturn part. Probably not worth it.

You'd need to
* load the frame type from sigcontext (and deal with EFAULT, etc.)
* make decision based on that
* pass the type down into sigreturn(), so we wouldn't run into
mismatches.

And all that just to avoid a single "subtract a constant from stack pointer"
insn. We are on a very shallow kernel stack here - it's a syscall entry,
after all. And the stack footprint of do_sigreturn() is fairly small - e.g.
stat(2) eats a lot more.

We are not initializing the gap either - it's just reserved on stack; we only
access it if we need to enlarge the stack frame.

IOW, what would be the benefit of trying to avoid unconditional gap there?

2021-09-16 00:57:25

by Michael Schmitz

[permalink] [raw]
Subject: Re: [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn()

Hi Al,

On 16/09/21 12:19, Al Viro wrote:
> On Thu, Sep 16, 2021 at 11:35:05AM +1200, Michael Schmitz wrote:
>
>> This one's a little harder - you use a 84 byte gap on each sigreturn, no
>> matter what the frame size we need to restore. The original
>> mangle_kernel_stack() only makes room on the stack when it has no other
>> option (using twice as much size - correct me if I'm wrong).
>>
>> Ideally, we'd only leave a gap for mangle_kernel_stack() to use if the frame
>> size requires us to do so. Working that out in asm glue would be
>> sufficiently convoluted as to cancel out the benefits of cleaning up the C
>> sigreturn part. Probably not worth it.
>
> You'd need to
> * load the frame type from sigcontext (and deal with EFAULT, etc.)
> * make decision based on that
> * pass the type down into sigreturn(), so we wouldn't run into
> mismatches.
>
> And all that just to avoid a single "subtract a constant from stack pointer"
> insn. We are on a very shallow kernel stack here - it's a syscall entry,
> after all. And the stack footprint of do_sigreturn() is fairly small - e.g.
> stat(2) eats a lot more.

Thanks, that's what I was wondering. Not worth the extra complexity then.

>
> We are not initializing the gap either - it's just reserved on stack; we only
> access it if we need to enlarge the stack frame.
>
> IOW, what would be the benefit of trying to avoid unconditional gap there?

Avoiding a kernel stack overflow - there are comments in the code that
warn against that, but those may be largely historic...

Cheers,

Michael

2021-09-16 03:23:48

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn()

On Thu, Sep 16, 2021 at 12:53:53PM +1200, Michael Schmitz wrote:
> > You'd need to
> > * load the frame type from sigcontext (and deal with EFAULT, etc.)
> > * make decision based on that
> > * pass the type down into sigreturn(), so we wouldn't run into
> > mismatches.
> >
> > And all that just to avoid a single "subtract a constant from stack pointer"
> > insn. We are on a very shallow kernel stack here - it's a syscall entry,
> > after all. And the stack footprint of do_sigreturn() is fairly small - e.g.
> > stat(2) eats a lot more.
>
> Thanks, that's what I was wondering. Not worth the extra complexity then.
>
> >
> > We are not initializing the gap either - it's just reserved on stack; we only
> > access it if we need to enlarge the stack frame.
> >
> > IOW, what would be the benefit of trying to avoid unconditional gap there?
>
> Avoiding a kernel stack overflow - there are comments in the code that warn
> against that, but those may be largely historic...

This is syscall entry; moreover, it critically relies upon the fixed stack
layout - type 0 exception frame + pt_regs + switch_stack + (now) gap.
Followed by fairly shallow C call chain. I suspect that the deepest you
can get there is when you get an unmapped page when reading the sigframe
and go into page fault handling, with call chain going into some filesystem's
->readpage(). If it was that close to stack overflow, we'd see them all
the time in e.g. random net ioctl doing copy_from_user() - that's going
to be deeper. Or in stat(2), for that matter.

2021-09-16 05:05:48

by Michael Schmitz

[permalink] [raw]
Subject: Re: [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn()

Hi Al,

On 16/09/21 15:21, Al Viro wrote:
> On Thu, Sep 16, 2021 at 12:53:53PM +1200, Michael Schmitz wrote:
>>> IOW, what would be the benefit of trying to avoid unconditional gap there?
>>
>> Avoiding a kernel stack overflow - there are comments in the code that warn
>> against that, but those may be largely historic...
>
> This is syscall entry; moreover, it critically relies upon the fixed stack
> layout - type 0 exception frame + pt_regs + switch_stack + (now) gap.

AFAIR, the concerns in the comments I saw were about interrupts - come
to think of it, back in the early days, we used to have 'fast' and
'slow' interrupt handlers, with much of the heavy lifting done in the
handler, and slow interrupts allowed to lower the IPL. Probably no
longer relevant.

> Followed by fairly shallow C call chain. I suspect that the deepest you
> can get there is when you get an unmapped page when reading the sigframe
> and go into page fault handling, with call chain going into some filesystem's
> ->readpage(). If it was that close to stack overflow, we'd see them all
> the time in e.g. random net ioctl doing copy_from_user() - that's going
> to be deeper. Or in stat(2), for that matter.

Your points are well taken - I can see now that my concerns are without
merit.

The only question that remains is whether the third patch can also go to
-stable. Most of my testing was with all three patches applied, I can
drop the third one and retest if you're worries the third one is not
appropriate for -stable.

Cheers,

Michael

2021-09-16 16:48:11

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn()

On Thu, Sep 16, 2021 at 05:02:22PM +1200, Michael Schmitz wrote:

> The only question that remains is whether the third patch can also go to
> -stable. Most of my testing was with all three patches applied, I can drop
> the third one and retest if you're worries the third one is not appropriate
> for -stable.

Up to m68k folks, really. The current mainline mangle_kernel_stack()
is, er, not nice and the entire area is delicate enough as it is (witness the
bugs dealt with in the rest of the series), but strictly speaking the third
patch is not fixing any functional bugs.