2006-11-09 15:55:30

by Sébastien Dugué

[permalink] [raw]
Subject: [PATCH -mm 0/3][AIO] - AIO completion signal notification


Hi

Here is the latest rework of the AIO completion signal notification patches.

This set consists in 3 patches:

1. aio-header-fix-includes: fixes the double inclusion of uio.h in aio.h

2. export-good_sigevent: move good_sigevent into signal.c and export it

3. aio-notify-sig: the AIO completion signal notification

Description are in the individual patches.

Comments are welcome as usual.

Thanks,

S?bastien.


2006-11-09 15:59:23

by Sébastien Dugué

[permalink] [raw]
Subject: [PATCH -mm 3/3][AIO] - AIO completion signal notification


AIO completion signal notification

The current 2.6 kernel does not support notification of user space via
an RT signal upon an asynchronous IO completion. The POSIX specification
states that when an AIO request completes, a signal can be delivered to
the application as notification.

This patch adds a struct sigevent *aio_sigeventp to the iocb.
The relevant fields (pid, signal number and value) are stored in the kiocb
for use when the request completes.

That sigevent structure is filled by the application as part of the AIO
request preparation. Upon request completion, the kernel notifies the
application using those sigevent parameters. If SIGEV_NONE has been specified,
then the old behaviour is retained and the application must rely on polling
the completion queue using io_getevents().


A struct sigevent *aio_sigeventp field is added to struct iocb in
include/linux/aio_abi.h

A struct aio_notify containing the sigevent parameters is defined in aio.h:

struct aio_notify {
struct task_struct *target;
__u16 signo;
__u16 notify;
sigval_t value;
};

A struct aio_notify ki_notify is added to struct kiocb in include/linux/aio.h

In io_submit_one(), if the application provided a sigevent then
setup_sigevent() is called which does the following:

- check access to the user sigevent and make a local copy

- if the requested notification is SIGEV_NONE, then nothing to do

- fill in the kiocb->ki_notify fields (notify, signo, value)

- check sigevent consistency, get the signal target task and
save it in kiocb->ki_notify.target

- preallocate a sigqueue for this event using sigqueue_alloc()

Upon request completion, in aio_complete(), if notification is needed for
this request (iocb->ki_notify.notify != SIGEV_NONE), then aio_send_signal()
is called to signal the target task as follows:

- fill in the siginfo struct to be sent to the task

- if notify is SIGEV_THREAD_ID then send signal to specific task
using send_sigqueue()

- else send signal to task group using send_5group_sigqueue()



Notes concerning sigqueue preallocation:

To ensure reliable delivery of completion notification, the sigqueue is
preallocated in the submission path so that there is no chance it can fail
in the completion path.

Unlike the posix-timers case (currently the single other user of sigqueue
preallocation), where the sigqueue is allocated for the lifetime of the timer
and freed at timer destruction time, the aio case is a bit more tricky due to
the async nature of the whole thing.

In the aio case, the sigqueue exists for the lifetime of the request, therefore
it must be freed only once the signal for the request completion has been
delivered. This involves changing __sigqueue_free() to free the sigqueue when the
signal is collected if si_code is SI_ASYNCIO even if it was preallocated as well
as explicitly calling sigqueue_free() in submission and completion error paths.


fs/aio.c | 129 ++++++++++++++++++++++++++++++++++++++++++++++--
include/linux/aio.h | 12 ++++
include/linux/aio_abi.h | 3 -
kernel/signal.c | 2
4 files changed, 141 insertions(+), 5 deletions(-)


Signed-off-by: S?bastien Dugu? <[email protected]>
Signed-off-by: Laurent Vivier <[email protected]>

Index: linux-2.6.19-rc5-mm1/fs/aio.c
===================================================================
--- linux-2.6.19-rc5-mm1.orig/fs/aio.c 2006-11-09 10:43:57.000000000 +0100
+++ linux-2.6.19-rc5-mm1/fs/aio.c 2006-11-09 14:34:07.000000000 +0100
@@ -416,6 +416,7 @@ static struct kiocb fastcall *__aio_get_
req->ki_dtor = NULL;
req->private = NULL;
req->ki_iovec = NULL;
+ req->ki_sigq = NULL;
INIT_LIST_HEAD(&req->ki_run_list);

/* Check if the completion queue has enough free space to
@@ -463,6 +464,11 @@ static inline void really_put_req(struct
req->ki_dtor(req);
if (req->ki_iovec != &req->ki_inline_vec)
kfree(req->ki_iovec);
+
+ /* Release task ref */
+ if (req->ki_notify.notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID))
+ put_task_struct(req->ki_notify.target);
+
kmem_cache_free(kiocb_cachep, req);
ctx->reqs_active--;

@@ -929,6 +935,97 @@ void fastcall kick_iocb(struct kiocb *io
}
EXPORT_SYMBOL(kick_iocb);

+static int aio_send_signal(struct kiocb *iocb)
+{
+ int ret;
+
+ memset(&iocb->ki_sigq->info, 0, sizeof(struct siginfo));
+
+ iocb->ki_sigq->info.si_signo = iocb->ki_notify.signo;
+ iocb->ki_sigq->info.si_errno = 0;
+ iocb->ki_sigq->info.si_code = SI_ASYNCIO;
+ iocb->ki_sigq->info.si_pid = 0;
+ iocb->ki_sigq->info.si_uid = 0;
+ iocb->ki_sigq->info.si_value = iocb->ki_notify.value;
+
+ if (iocb->ki_notify.notify & SIGEV_THREAD_ID)
+ ret = send_sigqueue(iocb->ki_notify.signo, iocb->ki_sigq,
+ iocb->ki_notify.target);
+ else
+ ret = send_group_sigqueue(iocb->ki_notify.signo, iocb->ki_sigq,
+ iocb->ki_notify.target);
+
+ return ret;
+}
+
+static long aio_setup_sigevent(struct kiocb *iocb,
+ struct sigevent __user *user_event)
+{
+ int error = 0;
+ sigevent_t event;
+ struct task_struct *target;
+ unsigned long flags;
+
+ if (!access_ok(VERIFY_READ, user_event, sizeof(struct sigevent)))
+ return -EFAULT;
+
+ if (copy_from_user(&event, user_event, sizeof (event)))
+ return -EFAULT;
+
+ /* Check for the SIGEV_NONE case */
+ if (event.sigev_notify == SIGEV_NONE)
+ return 0;
+
+ /* Setup the request completion notification parameters */
+ iocb->ki_notify.notify = event.sigev_notify;
+ iocb->ki_notify.signo = event.sigev_signo;
+ iocb->ki_notify.value = event.sigev_value;
+
+ /* Now get the notification target */
+ read_lock(&tasklist_lock);
+
+ if ((target = good_sigevent(&event))) {
+
+ spin_lock_irqsave(&target->sighand->siglock, flags);
+
+ if (!(target->flags & PF_EXITING)) {
+ iocb->ki_notify.target = target;
+
+ spin_unlock_irqrestore(&target->sighand->siglock, flags);
+
+ /*
+ * Get a ref on the task. It is dropped in really_put_req()
+ * when we're done with the iocb, be it from the normal
+ * completion path, the cancellation path or an error path.
+ */
+ if (iocb->ki_notify.notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID))
+ get_task_struct(target);
+ } else {
+ spin_unlock_irqrestore(&target->sighand->siglock, flags);
+ error = -EINVAL;
+ }
+ } else
+ error = -EINVAL;
+
+ read_unlock(&tasklist_lock);
+
+ if (!error) {
+ /*
+ * Alloc a sigqueue for this request
+ *
+ * NOTE: we cannot free the sigqueue in the completion path as
+ * the signal may not have been delivered to the target task.
+ * Therefore it has to be freed in __sigqueue_free() when the
+ * signal is collected if si_code is SI_ASYNCIO.
+ */
+ if (unlikely(!(iocb->ki_sigq = sigqueue_alloc())))
+ error = -EAGAIN;
+ }
+
+
+ return error;
+}
+
/* aio_complete
* Called when the io request on the given iocb is complete.
* Returns true if this is the last user of the request. The
@@ -976,8 +1073,11 @@ int fastcall aio_complete(struct kiocb *
* cancelled requests don't get events, userland was given one
* when the event got cancelled.
*/
- if (kiocbIsCancelled(iocb))
+ if (kiocbIsCancelled(iocb)) {
+ if (iocb->ki_sigq)
+ sigqueue_free(iocb->ki_sigq);
goto put_rq;
+ }

ring = kmap_atomic(info->ring_pages[0], KM_IRQ1);

@@ -1008,6 +1108,14 @@ int fastcall aio_complete(struct kiocb *

pr_debug("added to ring %p at [%lu]\n", iocb, tail);

+ if (iocb->ki_notify.notify != SIGEV_NONE) {
+ ret = aio_send_signal(iocb);
+
+ /* If signal generation failed, release the sigqueue */
+ if (ret)
+ sigqueue_free(iocb->ki_sigq);
+ }
+
pr_debug("%ld retries: %zd of %zd\n", iocb->ki_retried,
iocb->ki_nbytes - iocb->ki_left, iocb->ki_nbytes);
put_rq:
@@ -1549,8 +1657,7 @@ int fastcall io_submit_one(struct kioctx
ssize_t ret;

/* enforce forwards compatibility on users */
- if (unlikely(iocb->aio_reserved1 || iocb->aio_reserved2 ||
- iocb->aio_reserved3)) {
+ if (unlikely(iocb->aio_reserved1)) {
pr_debug("EINVAL: io_submit: reserve field set\n");
return -EINVAL;
}
@@ -1559,6 +1666,7 @@ int fastcall io_submit_one(struct kioctx
if (unlikely(
(iocb->aio_buf != (unsigned long)iocb->aio_buf) ||
(iocb->aio_nbytes != (size_t)iocb->aio_nbytes) ||
+ (iocb->aio_sigeventp != (unsigned long)iocb->aio_sigeventp) ||
((ssize_t)iocb->aio_nbytes < 0)
)) {
pr_debug("EINVAL: io_submit: overflow check\n");
@@ -1593,6 +1701,17 @@ int fastcall io_submit_one(struct kioctx
INIT_LIST_HEAD(&req->ki_wait.task_list);
req->ki_retried = 0;

+ /* handle setting up the sigevent for POSIX AIO signals */
+ req->ki_notify.notify = SIGEV_NONE;
+
+ if (iocb->aio_sigeventp) {
+ ret = aio_setup_sigevent(req,
+ (struct sigevent __user *)(unsigned long)
+ iocb->aio_sigeventp);
+ if (ret)
+ goto out_put_req;
+ }
+
ret = aio_setup_iocb(req);

if (ret)
@@ -1610,6 +1729,10 @@ int fastcall io_submit_one(struct kioctx
return 0;

out_put_req:
+ /* Undo the sigqueue alloc if someting went bad */
+ if (req->ki_sigq)
+ sigqueue_free(req->ki_sigq);
+
aio_put_req(req); /* drop extra ref to req */
aio_put_req(req); /* drop i/o ref to req */
return ret;
Index: linux-2.6.19-rc5-mm1/include/linux/aio_abi.h
===================================================================
--- linux-2.6.19-rc5-mm1.orig/include/linux/aio_abi.h 2006-11-09 10:43:57.000000000 +0100
+++ linux-2.6.19-rc5-mm1/include/linux/aio_abi.h 2006-11-09 10:46:25.000000000 +0100
@@ -82,8 +82,9 @@ struct iocb {
__u64 aio_nbytes;
__s64 aio_offset;

+ __u64 aio_sigeventp; /* pointer to struct sigevent */
+
/* extra parameters */
- __u64 aio_reserved2; /* TODO: use this for a (struct sigevent *) */
__u64 aio_reserved3;
}; /* 64 bytes */

Index: linux-2.6.19-rc5-mm1/include/linux/aio.h
===================================================================
--- linux-2.6.19-rc5-mm1.orig/include/linux/aio.h 2006-11-09 10:46:18.000000000 +0100
+++ linux-2.6.19-rc5-mm1/include/linux/aio.h 2006-11-09 14:34:09.000000000 +0100
@@ -7,6 +7,7 @@
#include <linux/uio.h>

#include <asm/atomic.h>
+#include <asm/siginfo.h>

#define AIO_MAXSEGS 4
#define AIO_KIOGRP_NR_ATOMIC 8
@@ -49,6 +50,13 @@ struct kioctx;
#define kiocbIsKicked(iocb) test_bit(KIF_KICKED, &(iocb)->ki_flags)
#define kiocbIsCancelled(iocb) test_bit(KIF_CANCELLED, &(iocb)->ki_flags)

+struct aio_notify {
+ struct task_struct *target;
+ __u16 signo;
+ __u16 notify;
+ sigval_t value;
+};
+
/* is there a better place to document function pointer methods? */
/**
* ki_retry - iocb forward progress callback
@@ -118,6 +126,10 @@ struct kiocb {

struct list_head ki_list; /* the aio core uses this
* for cancellation */
+
+ /* to notify a process on I/O event */
+ struct aio_notify ki_notify;
+ struct sigqueue *ki_sigq;
};

#define is_sync_kiocb(iocb) ((iocb)->ki_key == KIOCB_SYNC_KEY)
Index: linux-2.6.19-rc5-mm1/kernel/signal.c
===================================================================
--- linux-2.6.19-rc5-mm1.orig/kernel/signal.c 2006-11-09 10:46:23.000000000 +0100
+++ linux-2.6.19-rc5-mm1/kernel/signal.c 2006-11-09 10:46:25.000000000 +0100
@@ -297,7 +297,7 @@ static struct sigqueue *__sigqueue_alloc

static void __sigqueue_free(struct sigqueue *q)
{
- if (q->flags & SIGQUEUE_PREALLOC)
+ if (q->flags & SIGQUEUE_PREALLOC && q->info.si_code != SI_ASYNCIO)
return;
atomic_dec(&q->user->sigpending);
free_uid(q->user);


2006-11-09 15:59:15

by Sébastien Dugué

[permalink] [raw]
Subject: [PATCH -mm 2/3][AIO] - AIO completion signal notification


Export good_sigevent()


Move good_sigevent() from posix-timers.c to signal.c where it belongs,
and export it so that it can be used by other subsystems.


include/linux/signal.h | 2 ++
kernel/posix-timers.c | 17 -----------------
kernel/signal.c | 23 +++++++++++++++++++++++
3 files changed, 25 insertions(+), 17 deletions(-)

Signed-off-by: S?bastien Dugu? <[email protected]>


Index: linux-2.6.19-rc5-mm1/include/linux/signal.h
===================================================================
--- linux-2.6.19-rc5-mm1.orig/include/linux/signal.h 2006-11-09 10:43:58.000000000 +0100
+++ linux-2.6.19-rc5-mm1/include/linux/signal.h 2006-11-09 10:46:23.000000000 +0100
@@ -241,6 +241,8 @@ extern int sigprocmask(int, sigset_t *,
struct pt_regs;
extern int get_signal_to_deliver(siginfo_t *info, struct k_sigaction *return_ka, struct pt_regs *regs, void *cookie);

+extern struct task_struct * good_sigevent(sigevent_t *);
+
#endif /* __KERNEL__ */

#endif /* _LINUX_SIGNAL_H */
Index: linux-2.6.19-rc5-mm1/kernel/posix-timers.c
===================================================================
--- linux-2.6.19-rc5-mm1.orig/kernel/posix-timers.c 2006-11-09 10:43:58.000000000 +0100
+++ linux-2.6.19-rc5-mm1/kernel/posix-timers.c 2006-11-09 10:46:23.000000000 +0100
@@ -367,23 +367,6 @@ static enum hrtimer_restart posix_timer_
return ret;
}

-static struct task_struct * good_sigevent(sigevent_t * event)
-{
- struct task_struct *rtn = current->group_leader;
-
- if ((event->sigev_notify & SIGEV_THREAD_ID ) &&
- (!(rtn = find_task_by_pid(event->sigev_notify_thread_id)) ||
- rtn->tgid != current->tgid ||
- (event->sigev_notify & ~SIGEV_THREAD_ID) != SIGEV_SIGNAL))
- return NULL;
-
- if (((event->sigev_notify & ~SIGEV_THREAD_ID) != SIGEV_NONE) &&
- ((event->sigev_signo <= 0) || (event->sigev_signo > SIGRTMAX)))
- return NULL;
-
- return rtn;
-}
-
void register_posix_clock(const clockid_t clock_id, struct k_clock *new_clock)
{
if ((unsigned) clock_id >= MAX_CLOCKS) {
Index: linux-2.6.19-rc5-mm1/kernel/signal.c
===================================================================
--- linux-2.6.19-rc5-mm1.orig/kernel/signal.c 2006-11-09 10:43:58.000000000 +0100
+++ linux-2.6.19-rc5-mm1/kernel/signal.c 2006-11-09 10:46:23.000000000 +0100
@@ -1189,6 +1189,29 @@ int group_send_sig_info(int sig, struct
return ret;
}

+/***
+ * good_sigevent - check and get target task from a sigevent.
+ * @event: the sigevent to be checked
+ *
+ * This function must be called with tasklist_lock held for reading.
+ */
+struct task_struct * good_sigevent(sigevent_t * event)
+{
+ struct task_struct *rtn = current->group_leader;
+
+ if ((event->sigev_notify & SIGEV_THREAD_ID ) &&
+ (!(rtn = find_task_by_pid(event->sigev_notify_thread_id)) ||
+ rtn->tgid != current->tgid ||
+ (event->sigev_notify & ~SIGEV_THREAD_ID) != SIGEV_SIGNAL))
+ return NULL;
+
+ if (((event->sigev_notify & ~SIGEV_THREAD_ID) != SIGEV_NONE) &&
+ ((event->sigev_signo <= 0) || (event->sigev_signo > SIGRTMAX)))
+ return NULL;
+
+ return rtn;
+}
+
/*
* kill_pgrp_info() sends a signal to a process group: this is what the tty
* control characters do (^C, ^Z etc)


2006-11-09 15:59:00

by Sébastien Dugué

[permalink] [raw]
Subject: [PATCH -mm 1/3][AIO] - AIO completion signal notification


Fix the double inclusion of linux/uio.h in linux/aio.h



aio.h | 1 -
1 file changed, 1 deletion(-)

Signed-off-by: S?bastien Dugu? <[email protected]>

Index: linux-2.6.19-rc5-mm1/include/linux/aio.h
===================================================================
--- linux-2.6.19-rc5-mm1.orig/include/linux/aio.h 2006-11-09 10:43:58.000000000 +0100
+++ linux-2.6.19-rc5-mm1/include/linux/aio.h 2006-11-09 10:46:18.000000000 +0100
@@ -7,7 +7,6 @@
#include <linux/uio.h>

#include <asm/atomic.h>
-#include <linux/uio.h>

#define AIO_MAXSEGS 4
#define AIO_KIOGRP_NR_ATOMIC 8


2006-11-09 18:46:10

by Badari Pulavarty

[permalink] [raw]
Subject: Re: [PATCH -mm 3/3][AIO] - AIO completion signal notification

On Thu, 2006-11-09 at 16:59 +0100, Sébastien Dugué wrote:

> +
> +static long aio_setup_sigevent(struct kiocb *iocb,
> + struct sigevent __user *user_event)
> +{
> + int error = 0;
> + sigevent_t event;
> + struct task_struct *target;
> + unsigned long flags;
> +
> + if (!access_ok(VERIFY_READ, user_event, sizeof(struct sigevent)))
> + return -EFAULT;
> +
> + if (copy_from_user(&event, user_event, sizeof (event)))
> + return -EFAULT;
> +
> + /* Check for the SIGEV_NONE case */
> + if (event.sigev_notify == SIGEV_NONE)
> + return 0;
> +
> + /* Setup the request completion notification parameters */
> + iocb->ki_notify.notify = event.sigev_notify;
> + iocb->ki_notify.signo = event.sigev_signo;
> + iocb->ki_notify.value = event.sigev_value;
> +
> + /* Now get the notification target */
> + read_lock(&tasklist_lock);
> +
> + if ((target = good_sigevent(&event))) {
> +
> + spin_lock_irqsave(&target->sighand->siglock, flags);
> +
> + if (!(target->flags & PF_EXITING)) {
> + iocb->ki_notify.target = target;
> +
> + spin_unlock_irqrestore(&target->sighand->siglock, flags);
> +
> + /*
> + * Get a ref on the task. It is dropped in really_put_req()
> + * when we're done with the iocb, be it from the normal
> + * completion path, the cancellation path or an error path.
> + */
> + if (iocb->ki_notify.notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID))
> + get_task_struct(target);
> + } else {
> + spin_unlock_irqrestore(&target->sighand->siglock, flags);
> + error = -EINVAL;
> + }
> + } else
> + error = -EINVAL;
> +
> + read_unlock(&tasklist_lock);
> +
> + if (!error) {
> + /*
> + * Alloc a sigqueue for this request
> + *
> + * NOTE: we cannot free the sigqueue in the completion path as
> + * the signal may not have been delivered to the target task.
> + * Therefore it has to be freed in __sigqueue_free() when the
> + * signal is collected if si_code is SI_ASYNCIO.
> + */
> + if (unlikely(!(iocb->ki_sigq = sigqueue_alloc())))
> + error = -EAGAIN;

Don't you have to do put_task_struct() here ?

> + }
> +
> +
> + return error;
> +}
> +

Thanks,
Badari

2006-11-09 19:08:50

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH -mm 3/3][AIO] - AIO completion signal notification

> +static long aio_setup_sigevent(struct kiocb *iocb,
> + struct sigevent __user *user_event)
> +{
> + int error = 0;
> + sigevent_t event;
> + struct task_struct *target;
> + unsigned long flags;
> +
> + if (!access_ok(VERIFY_READ, user_event, sizeof(struct sigevent)))
> + return -EFAULT;
> +
> + if (copy_from_user(&event, user_event, sizeof (event)))
> + return -EFAULT;

> +

> +
> + /* Check for the SIGEV_NONE case */
> + if (event.sigev_notify == SIGEV_NONE)
> + return 0;
> +
> + /* Setup the request completion notification parameters */
> + iocb->ki_notify.notify = event.sigev_notify;
> + iocb->ki_notify.signo = event.sigev_signo;
> + iocb->ki_notify.value = event.sigev_value;
> +
> + /* Now get the notification target */
> + read_lock(&tasklist_lock);
> +
> + if ((target = good_sigevent(&event))) {
> +
> + spin_lock_irqsave(&target->sighand->siglock, flags);
> +
> + if (!(target->flags & PF_EXITING)) {
> + iocb->ki_notify.target = target;
> +
> + spin_unlock_irqrestore(&target->sighand->siglock, flags);
> +

I don't think there's a need to have the siglock for checking PF_EXITING,
and I also don't see why you need it for the assignment to
iocb->ki_notify.target. What is it intended to protect?

> + /*
> + * Get a ref on the task. It is dropped in really_put_req()
> + * when we're done with the iocb, be it from the normal
> + * completion path, the cancellation path or an error path.
> + */
> + if (iocb->ki_notify.notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID))
> + get_task_struct(target);
> + } else {
> + spin_unlock_irqrestore(&target->sighand->siglock, flags);
> + error = -EINVAL;
> + }
> + } else
> + error = -EINVAL;
> +
> + read_unlock(&tasklist_lock);
> +
> + if (!error) {
> + /*
> + * Alloc a sigqueue for this request
> + *
> + * NOTE: we cannot free the sigqueue in the completion path as
> + * the signal may not have been delivered to the target task.
> + * Therefore it has to be freed in __sigqueue_free() when the
> + * signal is collected if si_code is SI_ASYNCIO.
> + */
> + if (unlikely(!(iocb->ki_sigq = sigqueue_alloc())))
> + error = -EAGAIN;

We leak the task reference here.

All in all aio_setup_sigevent is a big mess and the style is very odd.
It's also overcommented on what's beeing done instead of why.
I've attached an draft on how it should look like below.

> + /* handle setting up the sigevent for POSIX AIO signals */
> + req->ki_notify.notify = SIGEV_NONE;
> +
> + if (iocb->aio_sigeventp) {
> + ret = aio_setup_sigevent(req,
> + (struct sigevent __user *)(unsigned long)
> + iocb->aio_sigeventp);
> + if (ret)
> + goto out_put_req;
> + }
> +
> ret = aio_setup_iocb(req);
>
> if (ret)
> @@ -1610,6 +1729,10 @@ int fastcall io_submit_one(struct kioctx
> return 0;
>
> out_put_req:
> + /* Undo the sigqueue alloc if someting went bad */
> + if (req->ki_sigq)
> + sigqueue_free(req->ki_sigq);
> +

Please put this after a separate label, so only callers after
aio_setup_sigevent try to check ki_sigq.



static long aio_setup_sigevent(struct kiocb *iocb,
struct sigevent __user *user_event)
{
sigevent_t event;
struct task_struct *target;
unsigned long flags;

if (copy_from_user(&event, user_event, sizeof (event)))
return -EFAULT;

if (event.sigev_notify == SIGEV_NONE)
return 0;

iocb->ki_notify.notify = event.sigev_notify;
iocb->ki_notify.signo = event.sigev_signo;
iocb->ki_notify.value = event.sigev_value;

read_lock(&tasklist_lock);
target = good_sigevent(&event);
if (unlikely(!target || (target->flags & PF_EXITING)))
goto out_unlock;
iocb->ki_notify.target = target;

if (iocb->ki_notify.notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID)) {
/*
* This reference will be dropped when we're done with
* the request.
*/
get_task_struct(target);
}
read_unlock(&tasklist_lock);

/*
* NOTE: we cannot free the sigqueue in the completion path as
* the signal may not have been delivered to the target task.
* Therefore it has to be freed in __sigqueue_free() when the
* signal is collected if si_code is SI_ASYNCIO.
*/
iocb->ki_sigq = sigqueue_alloc();
if (unlikely(!iocb->ki_sigq)) {
put_task_struct(target);
return -EAGAIN;
}

return 0;
out_unlock:
read_unlock(&tasklist_lock);
return -EINVAL;
}

2006-11-09 19:19:45

by Badari Pulavarty

[permalink] [raw]
Subject: Re: [PATCH -mm 3/3][AIO] - AIO completion signal notification

On Thu, 2006-11-09 at 16:59 +0100, Sébastien Dugué wrote:

> @@ -1549,8 +1657,7 @@ int fastcall io_submit_one(struct kioctx
> ssize_t ret;
>
> /* enforce forwards compatibility on users */
> - if (unlikely(iocb->aio_reserved1 || iocb->aio_reserved2 ||
> - iocb->aio_reserved3)) {
> + if (unlikely(iocb->aio_reserved1)) {
> pr_debug("EINVAL: io_submit: reserve field set\n");
> return -EINVAL;

Is there a reason for not checking "aio_reserved3" ?
You are still not using it. Right ?

Thanks,
Badari

2006-11-09 19:27:12

by Badari Pulavarty

[permalink] [raw]
Subject: Re: [PATCH -mm 3/3][AIO] - AIO completion signal notification

On Thu, 2006-11-09 at 19:08 +0000, Christoph Hellwig wrote:

Looks much better :)

>
> static long aio_setup_sigevent(struct kiocb *iocb,
> struct sigevent __user *user_event)
> {
> sigevent_t event;
> struct task_struct *target;
> unsigned long flags;
>
> if (copy_from_user(&event, user_event, sizeof (event)))
> return -EFAULT;
>
> if (event.sigev_notify == SIGEV_NONE)
> return 0;
>
> iocb->ki_notify.notify = event.sigev_notify;

Don't we want to verify to make sure that we are accepting only
SIGEV_SIGNAL or SIGEV_THREAD_ID and return -EINVAL, if some
one passes invalid event ? Like

if ((event.sigev_notify != SIGEV_SIGNAL) &&
(event.sigev_notify != SIGEV_THREAD_ID))
return -EINVAL;

> iocb->ki_notify.signo = event.sigev_signo;
> iocb->ki_notify.value = event.sigev_value;
>
> read_lock(&tasklist_lock);
> target = good_sigevent(&event);
> if (unlikely(!target || (target->flags & PF_EXITING)))
> goto out_unlock;
> iocb->ki_notify.target = target;
>
> if (iocb->ki_notify.notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID)) {
> /*
> * This reference will be dropped when we're done with
> * the request.
> */
> get_task_struct(target);
> }
> read_unlock(&tasklist_lock);
>
> /*
> * NOTE: we cannot free the sigqueue in the completion path as
> * the signal may not have been delivered to the target task.
> * Therefore it has to be freed in __sigqueue_free() when the
> * signal is collected if si_code is SI_ASYNCIO.
> */
> iocb->ki_sigq = sigqueue_alloc();
> if (unlikely(!iocb->ki_sigq)) {
> put_task_struct(target);
> return -EAGAIN;
> }
>
> return 0;
> out_unlock:
> read_unlock(&tasklist_lock);
> return -EINVAL;
> }
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-aio' in
> the body to [email protected]. For more info on Linux AIO,
> see: http://www.kvack.org/aio/
> Don't email: <a href=mailto:"[email protected]">[email protected]</a>

2006-11-10 01:52:36

by Badari Pulavarty

[permalink] [raw]
Subject: Re: [PATCH -mm 3/3][AIO] - AIO completion signal notification

S?bastien Dugu? wrote:
> AIO completion signal notification
>
>
> +
> + /* Release task ref */
> + if (req->ki_notify.notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID))
> + put_task_struct(req->ki_notify.target);
> +
Huh ?? I thought user can set SIGEV_SIGNAL or SIGEV_THREAD_ID.
Not both. Isn't it ? Shouldn't this be ?

if ((req->ki_notify.notify == SIGEV_SIGNAL) ||
(req->ki_notify.notify == SIGEV_THREAD_ID))
...

Samething in get_task_struct() also..

Thanks,
Badari

2006-11-10 01:54:51

by Badari Pulavarty

[permalink] [raw]
Subject: Re: [PATCH -mm 3/3][AIO] - AIO completion signal notification

Badari Pulavarty wrote:
> S?bastien Dugu? wrote:
>> AIO completion signal notification
>>
>>
>> +
>> + /* Release task ref */
>> + if (req->ki_notify.notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID))
>> + put_task_struct(req->ki_notify.target);
>> +
> Huh ?? I thought user can set SIGEV_SIGNAL or SIGEV_THREAD_ID.
> Not both. Isn't it ? Shouldn't this be ?
>
> if ((req->ki_notify.notify == SIGEV_SIGNAL) ||
> (req->ki_notify.notify == SIGEV_THREAD_ID))
> ...
>
Never mind.. got it.

Thanks,
Badari

2006-11-10 09:05:59

by Sébastien Dugué

[permalink] [raw]
Subject: Re: [PATCH -mm 3/3][AIO] - AIO completion signal notification

On Thu, 2006-11-09 at 10:46 -0800, Badari Pulavarty wrote:
> On Thu, 2006-11-09 at 16:59 +0100, S?bastien Dugu? wrote:
>
> > +
> > +static long aio_setup_sigevent(struct kiocb *iocb,
> > + struct sigevent __user *user_event)
> > +{
> > + int error = 0;
> > + sigevent_t event;
> > + struct task_struct *target;
> > + unsigned long flags;
> > +
> > + if (!access_ok(VERIFY_READ, user_event, sizeof(struct sigevent)))
> > + return -EFAULT;
> > +
> > + if (copy_from_user(&event, user_event, sizeof (event)))
> > + return -EFAULT;
> > +
> > + /* Check for the SIGEV_NONE case */
> > + if (event.sigev_notify == SIGEV_NONE)
> > + return 0;
> > +
> > + /* Setup the request completion notification parameters */
> > + iocb->ki_notify.notify = event.sigev_notify;
> > + iocb->ki_notify.signo = event.sigev_signo;
> > + iocb->ki_notify.value = event.sigev_value;
> > +
> > + /* Now get the notification target */
> > + read_lock(&tasklist_lock);
> > +
> > + if ((target = good_sigevent(&event))) {
> > +
> > + spin_lock_irqsave(&target->sighand->siglock, flags);
> > +
> > + if (!(target->flags & PF_EXITING)) {
> > + iocb->ki_notify.target = target;
> > +
> > + spin_unlock_irqrestore(&target->sighand->siglock, flags);
> > +
> > + /*
> > + * Get a ref on the task. It is dropped in really_put_req()
> > + * when we're done with the iocb, be it from the normal
> > + * completion path, the cancellation path or an error path.
> > + */
> > + if (iocb->ki_notify.notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID))
> > + get_task_struct(target);
> > + } else {
> > + spin_unlock_irqrestore(&target->sighand->siglock, flags);
> > + error = -EINVAL;
> > + }
> > + } else
> > + error = -EINVAL;
> > +
> > + read_unlock(&tasklist_lock);
> > +
> > + if (!error) {
> > + /*
> > + * Alloc a sigqueue for this request
> > + *
> > + * NOTE: we cannot free the sigqueue in the completion path as
> > + * the signal may not have been delivered to the target task.
> > + * Therefore it has to be freed in __sigqueue_free() when the
> > + * signal is collected if si_code is SI_ASYNCIO.
> > + */
> > + if (unlikely(!(iocb->ki_sigq = sigqueue_alloc())))
> > + error = -EAGAIN;
>
> Don't you have to do put_task_struct() here ?

Well, put_task_struct() is called in really_put_req() when we're
freeing the request. This should be OK unless I missed something.

S?bastien.

2006-11-10 09:14:20

by Sébastien Dugué

[permalink] [raw]
Subject: Re: [PATCH -mm 3/3][AIO] - AIO completion signal notification

On Thu, 2006-11-09 at 19:08 +0000, Christoph Hellwig wrote:
> > +static long aio_setup_sigevent(struct kiocb *iocb,
> > + struct sigevent __user *user_event)
> > +{
> > + int error = 0;
> > + sigevent_t event;
> > + struct task_struct *target;
> > + unsigned long flags;
> > +
> > + if (!access_ok(VERIFY_READ, user_event, sizeof(struct sigevent)))
> > + return -EFAULT;
> > +
> > + if (copy_from_user(&event, user_event, sizeof (event)))
> > + return -EFAULT;
>
> > +
>
> > +
> > + /* Check for the SIGEV_NONE case */
> > + if (event.sigev_notify == SIGEV_NONE)
> > + return 0;
> > +
> > + /* Setup the request completion notification parameters */
> > + iocb->ki_notify.notify = event.sigev_notify;
> > + iocb->ki_notify.signo = event.sigev_signo;
> > + iocb->ki_notify.value = event.sigev_value;
> > +
> > + /* Now get the notification target */
> > + read_lock(&tasklist_lock);
> > +
> > + if ((target = good_sigevent(&event))) {
> > +
> > + spin_lock_irqsave(&target->sighand->siglock, flags);
> > +
> > + if (!(target->flags & PF_EXITING)) {
> > + iocb->ki_notify.target = target;
> > +
> > + spin_unlock_irqrestore(&target->sighand->siglock, flags);
> > +
>
> I don't think there's a need to have the siglock for checking PF_EXITING,
> and I also don't see why you need it for the assignment to
> iocb->ki_notify.target. What is it intended to protect?

Right, some remnant that managed to sneak through, will fix.

>
> > + /*
> > + * Get a ref on the task. It is dropped in really_put_req()
> > + * when we're done with the iocb, be it from the normal
> > + * completion path, the cancellation path or an error path.
> > + */
> > + if (iocb->ki_notify.notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID))
> > + get_task_struct(target);
> > + } else {
> > + spin_unlock_irqrestore(&target->sighand->siglock, flags);
> > + error = -EINVAL;
> > + }
> > + } else
> > + error = -EINVAL;
> > +
> > + read_unlock(&tasklist_lock);
> > +
> > + if (!error) {
> > + /*
> > + * Alloc a sigqueue for this request
> > + *
> > + * NOTE: we cannot free the sigqueue in the completion path as
> > + * the signal may not have been delivered to the target task.
> > + * Therefore it has to be freed in __sigqueue_free() when the
> > + * signal is collected if si_code is SI_ASYNCIO.
> > + */
> > + if (unlikely(!(iocb->ki_sigq = sigqueue_alloc())))
> > + error = -EAGAIN;
>
> We leak the task reference here.

I don't think so, the reference is dropped in really_pu_req() when
the iocb is freed, unless I missed something.

>
> All in all aio_setup_sigevent is a big mess and the style is very odd.
> It's also overcommented on what's beeing done instead of why.
> I've attached an draft on how it should look like below.
>
> > + /* handle setting up the sigevent for POSIX AIO signals */
> > + req->ki_notify.notify = SIGEV_NONE;
> > +
> > + if (iocb->aio_sigeventp) {
> > + ret = aio_setup_sigevent(req,
> > + (struct sigevent __user *)(unsigned long)
> > + iocb->aio_sigeventp);
> > + if (ret)
> > + goto out_put_req;
> > + }
> > +
> > ret = aio_setup_iocb(req);
> >
> > if (ret)
> > @@ -1610,6 +1729,10 @@ int fastcall io_submit_one(struct kioctx
> > return 0;
> >
> > out_put_req:
> > + /* Undo the sigqueue alloc if someting went bad */
> > + if (req->ki_sigq)
> > + sigqueue_free(req->ki_sigq);
> > +
>
> Please put this after a separate label, so only callers after
> aio_setup_sigevent try to check ki_sigq.


Okay, cleaner that way.
>
>
>
> static long aio_setup_sigevent(struct kiocb *iocb,
> struct sigevent __user *user_event)
> {
> sigevent_t event;
> struct task_struct *target;
> unsigned long flags;
>
> if (copy_from_user(&event, user_event, sizeof (event)))
> return -EFAULT;
>
> if (event.sigev_notify == SIGEV_NONE)
> return 0;
>
> iocb->ki_notify.notify = event.sigev_notify;
> iocb->ki_notify.signo = event.sigev_signo;
> iocb->ki_notify.value = event.sigev_value;
>
> read_lock(&tasklist_lock);
> target = good_sigevent(&event);
> if (unlikely(!target || (target->flags & PF_EXITING)))
> goto out_unlock;
> iocb->ki_notify.target = target;
>
> if (iocb->ki_notify.notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID)) {
> /*
> * This reference will be dropped when we're done with
> * the request.
> */
> get_task_struct(target);
> }
> read_unlock(&tasklist_lock);
>
> /*
> * NOTE: we cannot free the sigqueue in the completion path as
> * the signal may not have been delivered to the target task.
> * Therefore it has to be freed in __sigqueue_free() when the
> * signal is collected if si_code is SI_ASYNCIO.
> */
> iocb->ki_sigq = sigqueue_alloc();
> if (unlikely(!iocb->ki_sigq)) {
> put_task_struct(target);
> return -EAGAIN;
> }
>
> return 0;
> out_unlock:
> read_unlock(&tasklist_lock);
> return -EINVAL;
> }
>

Cool, clearly better without taking the siglock, but again I don't
think the put_task_struct() is needed here.

Thanks,

S?bastien.


2006-11-10 09:21:04

by Sébastien Dugué

[permalink] [raw]
Subject: Re: [PATCH -mm 3/3][AIO] - AIO completion signal notification

On Thu, 2006-11-09 at 11:27 -0800, Badari Pulavarty wrote:
> On Thu, 2006-11-09 at 19:08 +0000, Christoph Hellwig wrote:
>
> Looks much better :)

Indeed ;-)

>
> >
> > static long aio_setup_sigevent(struct kiocb *iocb,
> > struct sigevent __user *user_event)
> > {
> > sigevent_t event;
> > struct task_struct *target;
> > unsigned long flags;
> >
> > if (copy_from_user(&event, user_event, sizeof (event)))
> > return -EFAULT;
> >
> > if (event.sigev_notify == SIGEV_NONE)
> > return 0;
> >
> > iocb->ki_notify.notify = event.sigev_notify;
>
> Don't we want to verify to make sure that we are accepting only
> SIGEV_SIGNAL or SIGEV_THREAD_ID and return -EINVAL, if some
> one passes invalid event ? Like
>
> if ((event.sigev_notify != SIGEV_SIGNAL) &&
> (event.sigev_notify != SIGEV_THREAD_ID))
> return -EINVAL;

That check is done in good_sigevent() below.

>
> > iocb->ki_notify.signo = event.sigev_signo;
> > iocb->ki_notify.value = event.sigev_value;
> >
> > read_lock(&tasklist_lock);
> > target = good_sigevent(&event);
> > if (unlikely(!target || (target->flags & PF_EXITING)))
> > goto out_unlock;
> > iocb->ki_notify.target = target;
> >
> > if (iocb->ki_notify.notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID)) {
> > /*
> > * This reference will be dropped when we're done with
> > * the request.
> > */
> > get_task_struct(target);
> > }
> > read_unlock(&tasklist_lock);
> >
> > /*
> > * NOTE: we cannot free the sigqueue in the completion path as
> > * the signal may not have been delivered to the target task.
> > * Therefore it has to be freed in __sigqueue_free() when the
> > * signal is collected if si_code is SI_ASYNCIO.
> > */
> > iocb->ki_sigq = sigqueue_alloc();
> > if (unlikely(!iocb->ki_sigq)) {
> > put_task_struct(target);
> > return -EAGAIN;
> > }
> >
> > return 0;
> > out_unlock:
> > read_unlock(&tasklist_lock);
> > return -EINVAL;
> > }

Thanks,

S?bastien.

2006-11-10 09:23:15

by Sébastien Dugué

[permalink] [raw]
Subject: Re: [PATCH -mm 3/3][AIO] - AIO completion signal notification

On Thu, 2006-11-09 at 11:18 -0800, Badari Pulavarty wrote:
> On Thu, 2006-11-09 at 16:59 +0100, S?bastien Dugu? wrote:
>
> > @@ -1549,8 +1657,7 @@ int fastcall io_submit_one(struct kioctx
> > ssize_t ret;
> >
> > /* enforce forwards compatibility on users */
> > - if (unlikely(iocb->aio_reserved1 || iocb->aio_reserved2 ||
> > - iocb->aio_reserved3)) {
> > + if (unlikely(iocb->aio_reserved1)) {
> > pr_debug("EINVAL: io_submit: reserve field set\n");
> > return -EINVAL;
>
> Is there a reason for not checking "aio_reserved3" ?
> You are still not using it. Right ?
>

Yep, forgot this one.

Thanks,

S?bastien.