There is no reason for handing the *resched pointer argument through
several functions just to check whether the signal is related to a self
rearming posix timer.
SI_TIMER is only used by the posix timer code and cannot be queued from
user space. The only extra check in collect_signal() to verify whether the
queued signal is preallocated is not really useful. Some other places
already check purely the SI_TIMER type.
Signed-off-by: Thomas Gleixner <[email protected]>
---
kernel/signal.c | 25 +++++++++----------------
1 file changed, 9 insertions(+), 16 deletions(-)
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -526,8 +526,7 @@ bool unhandled_signal(struct task_struct
return !tsk->ptrace;
}
-static void collect_signal(int sig, struct sigpending *list, kernel_siginfo_t *info,
- bool *resched_timer)
+static void collect_signal(int sig, struct sigpending *list, kernel_siginfo_t *info)
{
struct sigqueue *q, *first = NULL;
@@ -549,12 +548,6 @@ static void collect_signal(int sig, stru
still_pending:
list_del_init(&first->list);
copy_siginfo(info, &first->info);
-
- *resched_timer =
- (first->flags & SIGQUEUE_PREALLOC) &&
- (info->si_code == SI_TIMER) &&
- (info->si_sys_private);
-
__sigqueue_free(first);
} else {
/*
@@ -571,13 +564,12 @@ static void collect_signal(int sig, stru
}
}
-static int __dequeue_signal(struct sigpending *pending, sigset_t *mask,
- kernel_siginfo_t *info, bool *resched_timer)
+static int __dequeue_signal(struct sigpending *pending, sigset_t *mask, kernel_siginfo_t *info)
{
int sig = next_signal(pending, mask);
if (sig)
- collect_signal(sig, pending, info, resched_timer);
+ collect_signal(sig, pending, info);
return sig;
}
@@ -589,17 +581,15 @@ static int __dequeue_signal(struct sigpe
int dequeue_signal(sigset_t *mask, kernel_siginfo_t *info, enum pid_type *type)
{
struct task_struct *tsk = current;
- bool resched_timer = false;
int signr;
lockdep_assert_held(&tsk->sighand->siglock);
*type = PIDTYPE_PID;
- signr = __dequeue_signal(&tsk->pending, mask, info, &resched_timer);
+ signr = __dequeue_signal(&tsk->pending, mask, info);
if (!signr) {
*type = PIDTYPE_TGID;
- signr = __dequeue_signal(&tsk->signal->shared_pending,
- mask, info, &resched_timer);
+ signr = __dequeue_signal(&tsk->signal->shared_pending, mask, info);
if (unlikely(signr == SIGALRM))
posixtimer_rearm_itimer(tsk);
@@ -626,7 +616,7 @@ int dequeue_signal(sigset_t *mask, kerne
}
if (IS_ENABLED(CONFIG_POSIX_TIMERS)) {
- if (unlikely(resched_timer))
+ if (unlikely(info->si_code == SI_TIMER && info->si_sys_private))
posixtimer_rearm(info);
}
@@ -1011,6 +1001,9 @@ static int __send_signal_locked(int sig,
lockdep_assert_held(&t->sighand->siglock);
+ if (WARN_ON_ONCE(!is_si_special(info) && info->si_code == SI_TIMER))
+ return 0;
+
result = TRACE_SIGNAL_IGNORED;
if (!prepare_signal(sig, t, force))
goto ret;
On 04/11, Thomas Gleixner wrote:
>
> There is no reason for handing the *resched pointer argument through
> several functions just to check whether the signal is related to a self
> rearming posix timer.
Agreed, these changes looks good to me.
But,
> SI_TIMER is only used by the posix timer code and cannot be queued from
> user space.
Why? I think sigqueueinfo() can certainly use si_code = SI_TIMER, so
> @@ -1011,6 +1001,9 @@ static int __send_signal_locked(int sig,
>
> lockdep_assert_held(&t->sighand->siglock);
>
> + if (WARN_ON_ONCE(!is_si_special(info) && info->si_code == SI_TIMER))
> + return 0;
this can be easily triggered by userspace and thus looks wrong.
Oleg.
On 04/18, Oleg Nesterov wrote:
>
> On 04/11, Thomas Gleixner wrote:
> >
> > There is no reason for handing the *resched pointer argument through
> > several functions just to check whether the signal is related to a self
> > rearming posix timer.
>
> Agreed, these changes looks good to me.
I meant the intent.
But this is not simple, collect_signal() checks SIGQUEUE_PREALLOC exactly
because (iiuc) we need to ensure that SI_TIMER didn't come from userspace.
perhaps we should disallow SI_TIMER with _sys_private != 0 from userspace,
I dunno...
And I don't really understand the "not to be passed to user" comment in
include/uapi/asm-generic/siginfo.h. copy_siginfo_to_user() just copies
the whole kernel_siginfo.
Confused.
> But,
>
> > SI_TIMER is only used by the posix timer code and cannot be queued from
> > user space.
>
> Why? I think sigqueueinfo() can certainly use si_code = SI_TIMER, so
>
> > @@ -1011,6 +1001,9 @@ static int __send_signal_locked(int sig,
> >
> > lockdep_assert_held(&t->sighand->siglock);
> >
> > + if (WARN_ON_ONCE(!is_si_special(info) && info->si_code == SI_TIMER))
> > + return 0;
>
> this can be easily triggered by userspace and thus looks wrong.
>
> Oleg.
On 04/18, Oleg Nesterov wrote:
>
> On 04/18, Oleg Nesterov wrote:
> >
> > On 04/11, Thomas Gleixner wrote:
> > >
> > > There is no reason for handing the *resched pointer argument through
> > > several functions just to check whether the signal is related to a self
> > > rearming posix timer.
> >
> > Agreed, these changes looks good to me.
>
> I meant the intent.
>
> But this is not simple, collect_signal() checks SIGQUEUE_PREALLOC exactly
> because (iiuc) we need to ensure that SI_TIMER didn't come from userspace.
>
> perhaps we should disallow SI_TIMER with _sys_private != 0 from userspace,
> I dunno...
>
> And I don't really understand the "not to be passed to user" comment in
> include/uapi/asm-generic/siginfo.h. copy_siginfo_to_user() just copies
> the whole kernel_siginfo.
OK, si_sys_private is cleared in dequeue_signal() (or in posixtimer_rearm()
with this series).
In the past SI_TIMER was defined as __SI_CODE(__SI_TIMER,-2), it was > 0,
so it could not come from userspace (see the info->si_code >= 0 check in
do_rt_sigqueueinfo).
Today SI_TIMER < 0. We could introduce SI_TIMER_KERNEL > 0 with the minimal
changes, but this can't help because the commit 66dd34ad31e59 allows to send
any siginfo to itself.
Otoh, I have no idea how CRIU restores the posix timers. If a process has
a pending blocked SI_TIMER signal, then I guess it actually needs to enqueue
this signal at restore time, but resched_timer will be never true?
I got lost... Sorry for the noise.
> Confused.
>
> > But,
> >
> > > SI_TIMER is only used by the posix timer code and cannot be queued from
> > > user space.
> >
> > Why? I think sigqueueinfo() can certainly use si_code = SI_TIMER, so
> >
> > > @@ -1011,6 +1001,9 @@ static int __send_signal_locked(int sig,
> > >
> > > lockdep_assert_held(&t->sighand->siglock);
> > >
> > > + if (WARN_ON_ONCE(!is_si_special(info) && info->si_code == SI_TIMER))
> > > + return 0;
> >
> > this can be easily triggered by userspace and thus looks wrong.
> >
> > Oleg.
On Fri, Apr 19 2024 at 13:06, Oleg Nesterov wrote:
> On 04/18, Oleg Nesterov wrote:
>> But this is not simple, collect_signal() checks SIGQUEUE_PREALLOC exactly
>> because (iiuc) we need to ensure that SI_TIMER didn't come from userspace.
>>
>> perhaps we should disallow SI_TIMER with _sys_private != 0 from userspace,
>> I dunno...
>>
>> And I don't really understand the "not to be passed to user" comment in
>> include/uapi/asm-generic/siginfo.h. copy_siginfo_to_user() just copies
>> the whole kernel_siginfo.
>
> OK, si_sys_private is cleared in dequeue_signal() (or in posixtimer_rearm()
> with this series).
>
> In the past SI_TIMER was defined as __SI_CODE(__SI_TIMER,-2), it was > 0,
> so it could not come from userspace (see the info->si_code >= 0 check in
> do_rt_sigqueueinfo).
Duh.
> Today SI_TIMER < 0. We could introduce SI_TIMER_KERNEL > 0 with the minimal
> changes, but this can't help because the commit 66dd34ad31e59 allows to send
> any siginfo to itself.
Well that predates the __SI_CODE() removal. So I doubt it's required
today, but what do I know about CRIU.
> Otoh, I have no idea how CRIU restores the posix timers. If a process has
> a pending blocked SI_TIMER signal, then I guess it actually needs to enqueue
> this signal at restore time, but resched_timer will be never true?
It can't restore the correct sys_si_private value because that is
nowhere exposed to user space.
There is no special treatment for SI_TIMER, so the signal restore might
just end up queueing a random extra SI_TIMER signal if there was one
pending.
I checked the CRIU source and it looks like this just "works" by
reconstructing and rearming the timer with the last expiry value. As
that is in the past it will fire immediately and queue the signal.
sys_si_private seems to be excluded from being set from user space in
compat mode, but in non-compat mode I can't find anything which prevents
that.
Let me stare more at this.
Thanks,
tglx
On Tue, Apr 23 2024 at 23:18, Thomas Gleixner wrote:
> On Fri, Apr 19 2024 at 13:06, Oleg Nesterov wrote:
>> Otoh, I have no idea how CRIU restores the posix timers. If a process has
>> a pending blocked SI_TIMER signal, then I guess it actually needs to enqueue
>> this signal at restore time, but resched_timer will be never true?
>
> It can't restore the correct sys_si_private value because that is
> nowhere exposed to user space.
It is exposed via PTRACE_PEEKSIGINFO, but it's useless.
> There is no special treatment for SI_TIMER, so the signal restore might
> just end up queueing a random extra SI_TIMER signal if there was one
> pending.
It does. The sys_si_private value is not going to match the timer side
value and obviously the missing prealloc flag prevents it from trying to
rearm the timer.
> I checked the CRIU source and it looks like this just "works" by
> reconstructing and rearming the timer with the last expiry value. As
> that is in the past it will fire immediately and queue the signal.
It's not necessarily in the past, but it will fire eventually and in the
case of a blocked signal there will be two SI_TIMER signals queued.
So the patch is not completely wrong except that there is nothing which
prevents setting sys_si_private via rt_sigqueueinfo(), but that's
obviously a solvable problem. With that solved the condition:
*resched_timer =
(first->flags & SIGQUEUE_PREALLOC) &&
(info->si_code == SI_TIMER) &&
(info->si_sys_private);
really can be reduced to:
info->code == SI_TIMER && info->si_sys_private
In fact it makes a lot of sense _not_ to allow user space to set
info->si_sys_private because that's a kernel internal value and should
never be exposed to user space in the first place.
Let me look what it needs or if this can be solved slightly differently.
On Tue, Apr 23, 2024 at 6:48 PM Thomas Gleixner <[email protected]> wrote:
>
> On Tue, Apr 23 2024 at 23:18, Thomas Gleixner wrote:
> > On Fri, Apr 19 2024 at 13:06, Oleg Nesterov wrote:
> >> Today SI_TIMER < 0. We could introduce SI_TIMER_KERNEL > 0 with the minimal
> >> changes, but this can't help because the commit 66dd34ad31e59 allows to send
> >> any siginfo to itself.
>
> Well that predates the __SI_CODE() removal. So I doubt it's required
> today, but what do I know about CRIU.
CRIU needs to restore siginfo-s of pending signals so that a task sees
the same siginfo in a signal handler as it would be without
checkpoint/restore. CRIU will not be affected, if rt_sigqueueinfo denies
any non-negative si_code that is never reported by the kernel.
> >> Otoh, I have no idea how CRIU restores the posix timers. If a process has
> >> a pending blocked SI_TIMER signal, then I guess it actually needs to enqueue
> >> this signal at restore time, but resched_timer will be never true?
> >
> > It can't restore the correct sys_si_private value because that is
> > nowhere exposed to user space.
We are open to ideas how it can be restored properly.
>
> It is exposed via PTRACE_PEEKSIGINFO, but it's useless.
When PTRACE_PEEKSIGINFO was added, it didn't expose it. Then it was
changed by cc731525f26a ("signal: Remove kernel internal si_code magic").
The idea of PTRACE_PEEKSIGINFO is to get a siginfo that a process would
see in a signal handler.
>
> > There is no special treatment for SI_TIMER, so the signal restore might
> > just end up queueing a random extra SI_TIMER signal if there was one
> > pending.
>
> It does. The sys_si_private value is not going to match the timer side
> value and obviously the missing prealloc flag prevents it from trying to
> rearm the timer.
>
> > I checked the CRIU source and it looks like this just "works" by
> > reconstructing and rearming the timer with the last expiry value. As
> > that is in the past it will fire immediately and queue the signal.
>
> It's not necessarily in the past, but it will fire eventually and in the
> case of a blocked signal there will be two SI_TIMER signals queued.
>
> So the patch is not completely wrong except that there is nothing which
> prevents setting sys_si_private via rt_sigqueueinfo(), but that's
> obviously a solvable problem. With that solved the condition:
>
> *resched_timer =
> (first->flags & SIGQUEUE_PREALLOC) &&
> (info->si_code == SI_TIMER) &&
> (info->si_sys_private);
>
> really can be reduced to:
>
> info->code == SI_TIMER && info->si_sys_private
>
> In fact it makes a lot of sense _not_ to allow user space to set
> info->si_sys_private because that's a kernel internal value and should
> never be exposed to user space in the first place.
We can zero out all of them in rt_sigqueueinfo.
Thanks,
Andrei