2007-01-25 16:22:44

by Oleg Nesterov

[permalink] [raw]
Subject: Re: + aio-completion-signal-notification.patch added to -mm tree

Sebastien Dugue wrote:
>
> +static long aio_setup_sigevent(struct aio_notify *notify,
> + struct sigevent __user *user_event)
> +{
> + sigevent_t event;
> + struct task_struct *target;
> +
> + if (copy_from_user(&event, user_event, sizeof(event)))
> + return -EFAULT;
> +
> + if (event.sigev_notify == SIGEV_NONE)
> + return 0;
> +
> + notify->notify = event.sigev_notify;
> + notify->signo = event.sigev_signo;
> + notify->value = event.sigev_value;
> +
> + read_lock(&tasklist_lock);

We don't need tasklist_lock, we can use rcu_read_lock() instead.

> + target = good_sigevent(&event);
> +
> + if (unlikely(!target || (target->flags & PF_EXITING)))
> + goto out_unlock;

PF_EXITING check is racy and unneded. In fact, it is wrong. If the main
thread is already died, we can only use SIGEV_THREAD_ID signals, because
otherwise good_sigevent() returns ->group_leader.

> @@ -994,6 +1077,15 @@ int fastcall aio_complete(struct kiocb *
> kunmap_atomic(ring, KM_IRQ1);
>
> pr_debug("added to ring %p at [%lu]\n", iocb, tail);
> +
> + if (iocb->ki_notify.notify != SIGEV_NONE) {
> + ret = aio_send_signal(&iocb->ki_notify);
> +
> + /* If signal generation failed, release the sigqueue */
> + if (ret)
> + sigqueue_free(iocb->ki_notify.sigq);

We should not use sigqueue_free() here. It takes current->sighand->siglock
to remove sigqueue from "struct sigpending". But current is just a "random"
process here.

Yes, if I understand this patch correctly, it is not possible that this
sigqueue is pending, but still this is bad imho.

> static void __sigqueue_free(struct sigqueue *q)
> {
> - if (q->flags & SIGQUEUE_PREALLOC)
> + if (q->flags & SIGQUEUE_PREALLOC && q->info.si_code != SI_ASYNCIO)
> return;

Oh, this is not nice. Could we change send_sigqueue/send_group_sigqueue
instead ?

- BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
+ BUG_ON(!(q->flags & SIGQUEUE_PREALLOC) && q->info.si_code != SI_ASYNCIO);

This way aio can use __sigqueue_alloc/__sigqueue_free directly and forget
about SIGQUEUE_PREALLOC.

On the other hand, imho this patch takes a wrong direction.

The purpose of SIGQUEUE_PREALLOC + send_sigqueue() is to re-use the same
sigqueue while sending a stream of signals. But in aio case we allocate
sigqueue to send only 1 signal, then it freed after the delivery like
the regular sigqueue. So what is the point?

I'd suggest to not use this interface. Just use group_send_sig_info() or
specific_send_sig_info(). Yes, this way we will do GFP_ATOMIC allocation
of sigqueue in interrupt context, but is this so bad in this case?

Oleg.


2007-01-25 16:34:59

by Oleg Nesterov

[permalink] [raw]
Subject: Re: + aio-completion-signal-notification.patch added to -mm tree

On 01/25, Oleg Nesterov wrote:
>
> Sebastien Dugue wrote:
> >
> > + if (iocb->ki_notify.notify != SIGEV_NONE) {
> > + ret = aio_send_signal(&iocb->ki_notify);
> > +
> > + /* If signal generation failed, release the sigqueue */
> > + if (ret)
> > + sigqueue_free(iocb->ki_notify.sigq);
>
> We should not use sigqueue_free() here. It takes current->sighand->siglock
> to remove sigqueue from "struct sigpending". But current is just a "random"
> process here.

... and it is possible that current->sighand == NULL.

Oleg.

2007-01-26 11:15:13

by Sébastien Dugué

[permalink] [raw]
Subject: Re: + aio-completion-signal-notification.patch added to -mm tree

On Thu, 25 Jan 2007 19:21:41 +0300 Oleg Nesterov <[email protected]> wrote:

> Sebastien Dugue wrote:
> >
> > +static long aio_setup_sigevent(struct aio_notify *notify,
> > + struct sigevent __user *user_event)
> > +{
> > + sigevent_t event;
> > + struct task_struct *target;
> > +
> > + if (copy_from_user(&event, user_event, sizeof(event)))
> > + return -EFAULT;
> > +
> > + if (event.sigev_notify == SIGEV_NONE)
> > + return 0;
> > +
> > + notify->notify = event.sigev_notify;
> > + notify->signo = event.sigev_signo;
> > + notify->value = event.sigev_value;
> > +
> > + read_lock(&tasklist_lock);
>
> We don't need tasklist_lock, we can use rcu_read_lock() instead.

Right that will be beneficial, will change.

>
> > + target = good_sigevent(&event);
> > +
> > + if (unlikely(!target || (target->flags & PF_EXITING)))
> > + goto out_unlock;
>
> PF_EXITING check is racy and unneded. In fact, it is wrong. If the main
> thread is already died, we can only use SIGEV_THREAD_ID signals, because
> otherwise good_sigevent() returns ->group_leader.

Care to explain here please, I'm not following you.

>
> > @@ -994,6 +1077,15 @@ int fastcall aio_complete(struct kiocb *
> > kunmap_atomic(ring, KM_IRQ1);
> >
> > pr_debug("added to ring %p at [%lu]\n", iocb, tail);
> > +
> > + if (iocb->ki_notify.notify != SIGEV_NONE) {
> > + ret = aio_send_signal(&iocb->ki_notify);
> > +
> > + /* If signal generation failed, release the sigqueue */
> > + if (ret)
> > + sigqueue_free(iocb->ki_notify.sigq);
>
> We should not use sigqueue_free() here. It takes current->sighand->siglock
> to remove sigqueue from "struct sigpending". But current is just a "random"
> process here.
>
> Yes, if I understand this patch correctly, it is not possible that this
> sigqueue is pending, but still this is bad imho.

Yes, in fact the sigqueue is used for a single signal delivery and then
free. In fact I could have used directly __sigqueue_free() instead here
except for the fact that it's private to signal.c and I'm reluctant
to export it to other subsystems.

>
> > static void __sigqueue_free(struct sigqueue *q)
> > {
> > - if (q->flags & SIGQUEUE_PREALLOC)
> > + if (q->flags & SIGQUEUE_PREALLOC && q->info.si_code != SI_ASYNCIO)
> > return;
>
> Oh, this is not nice. Could we change send_sigqueue/send_group_sigqueue
> instead ?

Yep, that's the other solution.

>
> - BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
> + BUG_ON(!(q->flags & SIGQUEUE_PREALLOC) && q->info.si_code != SI_ASYNCIO);
>
> This way aio can use __sigqueue_alloc/__sigqueue_free directly and forget
> about SIGQUEUE_PREALLOC.

Well, I don't think it's cleaner. The aio error path calls sigqueue_free()
directly whereas in case of success sigqueue_free() is called from the signal
delivery path.

>
> On the other hand, imho this patch takes a wrong direction.
>
> The purpose of SIGQUEUE_PREALLOC + send_sigqueue() is to re-use the same
> sigqueue while sending a stream of signals. But in aio case we allocate
> sigqueue to send only 1 signal, then it freed after the delivery like
> the regular sigqueue. So what is the point?
>
> I'd suggest to not use this interface. Just use group_send_sig_info() or
> specific_send_sig_info(). Yes, this way we will do GFP_ATOMIC allocation
> of sigqueue in interrupt context, but is this so bad in this case?

Well, the thihere is that in the past we used group_send_sig_info()
and specific_send_sig_info() for notification but Zach Brown raised
the question about reliable signal delivery. IOW an aio submission
should not succeed if signal delivery is going to fail. Hence the
use of the preallocated sigqueue.

Thanks,

S?bastien.





2007-01-26 11:53:10

by Oleg Nesterov

[permalink] [raw]
Subject: Re: + aio-completion-signal-notification.patch added to -mm tree

On 01/26, S?bastien Dugu? wrote:
>
> On Thu, 25 Jan 2007 19:21:41 +0300 Oleg Nesterov <[email protected]> wrote:
>
> > > + target = good_sigevent(&event);
> > > +
> > > + if (unlikely(!target || (target->flags & PF_EXITING)))
> > > + goto out_unlock;
> >
> > PF_EXITING check is racy and unneded. In fact, it is wrong. If the main
> > thread is already died, we can only use SIGEV_THREAD_ID signals, because
> > otherwise good_sigevent() returns ->group_leader.
>
> Care to explain here please, I'm not following you.

My apologies, I was unclear.

This check is racy, the condition could be changed right after the check.

It is unneeded, it is ok to do send_sigqueue(tsk) if if that task is already
dead. (we hold the reference to task_struct).

Now suppose that the main thread (->group_leader) already exited. This is
normal, the thread group is still alive, it should be ok to send a signal to
it via send_group_sigqueue(). But we can't: without SIGEV_THREAD_ID in
->sigev_notify good_event() returns ->group_leader, and it has PF_EXITING.

Yes, kernel/posix-timers.c needs a cleanup too. But please note that it does
this check for another reason (according to the comment). This reason is not
valid now, the callsite for exit_itimers() was moved from __exit_signal() to
do_exit().

> > > + if (iocb->ki_notify.notify != SIGEV_NONE) {
> > > + ret = aio_send_signal(&iocb->ki_notify);
> > > +
> > > + /* If signal generation failed, release the sigqueue */
> > > + if (ret)
> > > + sigqueue_free(iocb->ki_notify.sigq);
> >
> > We should not use sigqueue_free() here. It takes current->sighand->siglock
> > to remove sigqueue from "struct sigpending". But current is just a "random"
> > process here.
> >
> > Yes, if I understand this patch correctly, it is not possible that this
> > sigqueue is pending, but still this is bad imho.
>
> Yes, in fact the sigqueue is used for a single signal delivery and then
> free. In fact I could have used directly __sigqueue_free() instead here
> except for the fact that it's private to signal.c and I'm reluctant
> to export it to other subsystems.

I personally think it is better to export __sigqueue_free() even if sigqueue_free()
happens to work. It is to fragile imho to reference current->sighand. At least
we need a fat comment.

> > > static void __sigqueue_free(struct sigqueue *q)
> > > {
> > > - if (q->flags & SIGQUEUE_PREALLOC)
> > > + if (q->flags & SIGQUEUE_PREALLOC && q->info.si_code != SI_ASYNCIO)
> > > return;
> >
> > Oh, this is not nice. Could we change send_sigqueue/send_group_sigqueue
> > instead ?
>
> Yep, that's the other solution.
>
> >
> > - BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
> > + BUG_ON(!(q->flags & SIGQUEUE_PREALLOC) && q->info.si_code != SI_ASYNCIO);
> >
> > This way aio can use __sigqueue_alloc/__sigqueue_free directly and forget
> > about SIGQUEUE_PREALLOC.
>
> Well, I don't think it's cleaner. The aio error path calls sigqueue_free()
> directly whereas in case of success sigqueue_free() is called from the signal
> delivery path.

Hmm... now I don't understand you. Of course, the aio error path should use
__sigqueue_free() if we don't use SIGQUEUE_PREALLOC (and imho we should not).

And the signal delivery path uses __sigqueue_free() too.

?

> > I'd suggest to not use this interface. Just use group_send_sig_info() or
> > specific_send_sig_info(). Yes, this way we will do GFP_ATOMIC allocation
> > of sigqueue in interrupt context, but is this so bad in this case?
>
> Well, the thihere is that in the past we used group_send_sig_info()
> and specific_send_sig_info() for notification but Zach Brown raised
> the question about reliable signal delivery. IOW an aio submission
> should not succeed if signal delivery is going to fail. Hence the
> use of the preallocated sigqueue.

Ok, I see, thanks.

Oleg.

2007-01-26 13:06:52

by Sébastien Dugué

[permalink] [raw]
Subject: Re: + aio-completion-signal-notification.patch added to -mm tree

On Fri, 26 Jan 2007 14:52:33 +0300 Oleg Nesterov <[email protected]> wrote:

> On 01/26, S?bastien Dugu? wrote:
> >
> > On Thu, 25 Jan 2007 19:21:41 +0300 Oleg Nesterov <[email protected]> wrote:
> >
> > > > + target = good_sigevent(&event);
> > > > +
> > > > + if (unlikely(!target || (target->flags & PF_EXITING)))
> > > > + goto out_unlock;
> > >
> > > PF_EXITING check is racy and unneded. In fact, it is wrong. If the main
> > > thread is already died, we can only use SIGEV_THREAD_ID signals, because
> > > otherwise good_sigevent() returns ->group_leader.
> >
> > Care to explain here please, I'm not following you.
>
> My apologies, I was unclear.
>
> This check is racy, the condition could be changed right after the check.
>
> It is unneeded, it is ok to do send_sigqueue(tsk) if if that task is already
> dead. (we hold the reference to task_struct).
>
> Now suppose that the main thread (->group_leader) already exited. This is
> normal, the thread group is still alive, it should be ok to send a signal to
> it via send_group_sigqueue(). But we can't: without SIGEV_THREAD_ID in
> ->sigev_notify good_event() returns ->group_leader, and it has PF_EXITING.

Thanks, I understand the problem now. I will fix this.

>
> Yes, kernel/posix-timers.c needs a cleanup too. But please note that it does
> this check for another reason (according to the comment). This reason is not
> valid now, the callsite for exit_itimers() was moved from __exit_signal() to
> do_exit().
>
> > > > + if (iocb->ki_notify.notify != SIGEV_NONE) {
> > > > + ret = aio_send_signal(&iocb->ki_notify);
> > > > +
> > > > + /* If signal generation failed, release the sigqueue */
> > > > + if (ret)
> > > > + sigqueue_free(iocb->ki_notify.sigq);
> > >
> > > We should not use sigqueue_free() here. It takes current->sighand->siglock
> > > to remove sigqueue from "struct sigpending". But current is just a "random"
> > > process here.
> > >
> > > Yes, if I understand this patch correctly, it is not possible that this
> > > sigqueue is pending, but still this is bad imho.
> >
> > Yes, in fact the sigqueue is used for a single signal delivery and then
> > free. In fact I could have used directly __sigqueue_free() instead here
> > except for the fact that it's private to signal.c and I'm reluctant
> > to export it to other subsystems.
>
> I personally think it is better to export __sigqueue_free() even if sigqueue_free()
> happens to work. It is to fragile imho to reference current->sighand. At least
> we need a fat comment.

OK.

>
> > > > static void __sigqueue_free(struct sigqueue *q)
> > > > {
> > > > - if (q->flags & SIGQUEUE_PREALLOC)
> > > > + if (q->flags & SIGQUEUE_PREALLOC && q->info.si_code != SI_ASYNCIO)
> > > > return;
> > >
> > > Oh, this is not nice. Could we change send_sigqueue/send_group_sigqueue
> > > instead ?
> >
> > Yep, that's the other solution.
> >
> > >
> > > - BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
> > > + BUG_ON(!(q->flags & SIGQUEUE_PREALLOC) && q->info.si_code != SI_ASYNCIO);
> > >
> > > This way aio can use __sigqueue_alloc/__sigqueue_free directly and forget
> > > about SIGQUEUE_PREALLOC.
> >
> > Well, I don't think it's cleaner. The aio error path calls sigqueue_free()
> > directly whereas in case of success sigqueue_free() is called from the signal
> > delivery path.
>
> Hmm... now I don't understand you. Of course, the aio error path should use
> __sigqueue_free() if we don't use SIGQUEUE_PREALLOC (and imho we should not).
>
> And the signal delivery path uses __sigqueue_free() too.
>
> ?
>
> > > I'd suggest to not use this interface. Just use group_send_sig_info() or
> > > specific_send_sig_info(). Yes, this way we will do GFP_ATOMIC allocation
> > > of sigqueue in interrupt context, but is this so bad in this case?
> >
> > Well, the thihere is that in the past we used group_send_sig_info()
> > and specific_send_sig_info() for notification but Zach Brown raised
> > the question about reliable signal delivery. IOW an aio submission
> > should not succeed if signal delivery is going to fail. Hence the
> > use of the preallocated sigqueue.
>
> Ok, I see, thanks.
>
> Oleg.
>