2008-01-03 09:04:50

by Rusty Russell

Subject: [PATCH] aio: partial write should not return error code.

When an AIO write gets an error after writing some data (eg. ENOSPC),
it should return the amount written already, not the error. Just like
write() is supposed to.

This was found by the libaio test suite.

Signed-off-by: Rusty Russell <[email protected]>

diff -r 18802689361a fs/aio.c
--- a/fs/aio.c Thu Jan 03 15:22:24 2008 +1100
+++ b/fs/aio.c Thu Jan 03 18:05:25 2008 +1100
@@ -1346,6 +1350,13 @@ static ssize_t aio_rw_vect_retry(struct
 	/* This means we must have transferred all that we could */
 	/* No need to retry anymore */
 	if ((ret == 0) || (iocb->ki_left == 0))
+		ret = iocb->ki_nbytes - iocb->ki_left;
+
+	/* If we managed to write some out we return that, rather than
+	 * the eventual error. */
+	if (opcode == IOCB_CMD_PWRITEV
+	    && ret < 0
+	    && iocb->ki_nbytes - iocb->ki_left)
 		ret = iocb->ki_nbytes - iocb->ki_left;
 
 	return ret;


2008-01-03 09:05:22

by Rusty Russell

Subject: [PATCH] aio: negative offset should return -EINVAL

An AIO read or write should return -EINVAL if the offset is negative.
This check matches the one in pread and pwrite.

This was found by the libaio test suite.

Signed-off-by: Rusty Russell <[email protected]>

diff -r 18802689361a fs/aio.c
--- a/fs/aio.c Thu Jan 03 15:22:24 2008 +1100
+++ b/fs/aio.c Thu Jan 03 18:05:25 2008 +1100
@@ -1330,6 +1330,10 @@ static ssize_t aio_rw_vect_retry(struct
 		opcode = IOCB_CMD_PWRITEV;
 	}
 
+	/* This matches the pread()/pwrite() logic */
+	if (iocb->ki_pos < 0)
+		return -EINVAL;
+
 	do {
 		ret = rw_op(iocb, &iocb->ki_iovec[iocb->ki_cur_seg],
 			    iocb->ki_nr_segs - iocb->ki_cur_seg,

2008-01-03 20:04:58

by Zach Brown

Subject: Re: [PATCH] aio: partial write should not return error code.

Rusty Russell wrote:
> When an AIO write gets an error after writing some data (eg. ENOSPC),
> it should return the amount written already, not the error. Just like
> write() is supposed to.

Andrew, please don't queue this fix. I think the bug is valid but the
patch is subtly dangerous.

> diff -r 18802689361a fs/aio.c
> --- a/fs/aio.c Thu Jan 03 15:22:24 2008 +1100
> +++ b/fs/aio.c Thu Jan 03 18:05:25 2008 +1100
> @@ -1346,6 +1350,13 @@ static ssize_t aio_rw_vect_retry(struct
>  	/* This means we must have transferred all that we could */
>  	/* No need to retry anymore */
>  	if ((ret == 0) || (iocb->ki_left == 0))
> +		ret = iocb->ki_nbytes - iocb->ki_left;
> +
> +	/* If we managed to write some out we return that, rather than
> +	 * the eventual error. */
> +	if (opcode == IOCB_CMD_PWRITEV
> +	    && ret < 0
> +	    && iocb->ki_nbytes - iocb->ki_left)
>  		ret = iocb->ki_nbytes - iocb->ki_left;

This doesn't account for the (sigh) -EIOCB* error codes. They must be
returned to the caller so that it can properly handle the iocb reference
counting. Failure to do so can lead to oopses.

To be fair, I think you'll have a really hard time finding an
->aio_write() implementation which would return partial progress and
*then* one of the magical errnos. But the infrastructure does allow it.

So maybe we could get a helper in aio.h that abstracts out the

(ret < 0 && ret != -EIOCBQUEUED && ret != -EIOCBRETRY)

condition. Then I think this patch would be fine.

I assigned a bug to remind myself to revisit this if you aren't excited
by continuing with the patch:

http://bugzilla.kernel.org/show_bug.cgi?id=9681

- z

2008-01-03 20:17:47

by Zach Brown

Subject: Re: [PATCH] aio: negative offset should return -EINVAL

Rusty Russell wrote:
> An AIO read or write should return -EINVAL if the offset is negative.
> This check matches the one in pread and pwrite.
>
> This was found by the libaio test suite.
>
> Signed-off-by: Rusty Russell <[email protected]>

This looks fine to me.

Signed-off-by: Zach Brown <[email protected]>

- z

2008-01-04 03:10:35

by Rusty Russell

Subject: Re: [PATCH] aio: partial write should not return error code.

On Friday 04 January 2008 07:04:30 Zach Brown wrote:
> Rusty Russell wrote:
> > When an AIO write gets an error after writing some data (eg. ENOSPC),
> > it should return the amount written already, not the error. Just like
> > write() is supposed to.
>
> Andrew, please don't queue this fix. I think the bug is valid but the
> patch is subtly dangerous.
>
> > diff -r 18802689361a fs/aio.c
> > --- a/fs/aio.c Thu Jan 03 15:22:24 2008 +1100
> > +++ b/fs/aio.c Thu Jan 03 18:05:25 2008 +1100
> > @@ -1346,6 +1350,13 @@ static ssize_t aio_rw_vect_retry(struct
> >  	/* This means we must have transferred all that we could */
> >  	/* No need to retry anymore */
> >  	if ((ret == 0) || (iocb->ki_left == 0))
> > +		ret = iocb->ki_nbytes - iocb->ki_left;
> > +
> > +	/* If we managed to write some out we return that, rather than
> > +	 * the eventual error. */
> > +	if (opcode == IOCB_CMD_PWRITEV
> > +	    && ret < 0
> > +	    && iocb->ki_nbytes - iocb->ki_left)
> >  		ret = iocb->ki_nbytes - iocb->ki_left;
>
> This doesn't account for the (sigh) -EIOCB* error codes. They must be
> returned to the caller so that it can properly handle the iocb reference
> counting. Failure to do so can lead to oopses.
>
> To be fair, I think you'll have a really hard time finding an
> ->aio_write() implementation which would return partial progress and
> *then* one of the magical errnos. But the infrastructure does allow it.

Erk, thanks.

> So maybe we could get a helper in aio.h that abstracts out the
>
> (ret < 0 && ret != -EIOCBQUEUED && ret != -EIOCBRETRY)
>
> condition. Then I think this patch would be fine.
>
> I assigned a bug to remind myself to revisit this if you aren't excited
> by continuing with the patch:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=9681

No, that's fine, here is the new one:

When an AIO write gets a non-retry error after writing some data
(eg. ENOSPC), it should return the amount written already, not the
error. Just like write() is supposed to.

This was found by the libaio test suite.

Signed-off-by: Rusty Russell <[email protected]>
---
fs/aio.c | 7 +++++++
1 file changed, 7 insertions(+)

diff -r 18802689361a fs/aio.c
--- a/fs/aio.c Thu Jan 03 15:22:24 2008 +1100
+++ b/fs/aio.c Thu Jan 03 18:05:25 2008 +1100
@@ -1346,6 +1350,13 @@ static ssize_t aio_rw_vect_retry(struct
 	/* This means we must have transferred all that we could */
 	/* No need to retry anymore */
 	if ((ret == 0) || (iocb->ki_left == 0))
+		ret = iocb->ki_nbytes - iocb->ki_left;
+
+	/* If we managed to write some out we return that, rather than
+	 * the eventual error. */
+	if (opcode == IOCB_CMD_PWRITEV
+	    && ret < 0 && ret != -EIOCBQUEUED && ret != -EIOCBRETRY
+	    && iocb->ki_nbytes - iocb->ki_left)
 		ret = iocb->ki_nbytes - iocb->ki_left;
 
 	return ret;
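
To make the resulting return-value rules easier to follow, the tail of the function after this revision can be modelled in user space roughly as below. This is a sketch, not kernel code: aio_final_ret() and its parameters are invented names standing in for ret, iocb->ki_nbytes, iocb->ki_left and the opcode test, and the two errno values are local stand-ins assumed to mirror the kernel-internal ones.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Stand-ins for the kernel-internal errnos; the values are an assumption. */
#define EIOCBQUEUED	529
#define EIOCBRETRY	530

/*
 * Hypothetical user-space model of the patched return logic.
 *   ret      - what the last rw_op call returned
 *   nbytes   - total bytes requested (ki_nbytes)
 *   left     - bytes still outstanding (ki_left)
 *   is_write - nonzero when the opcode is IOCB_CMD_PWRITEV
 */
static ssize_t aio_final_ret(ssize_t ret, size_t nbytes, size_t left,
			     int is_write)
{
	ssize_t done;

	assert(left <= nbytes);
	done = (ssize_t)(nbytes - left);

	/* Everything possible was transferred: report the byte count. */
	if (ret == 0 || left == 0)
		ret = done;

	/*
	 * A write that made progress and then hit an ordinary error
	 * reports the progress; the retry/queued codes pass through.
	 */
	if (is_write && ret < 0 && ret != -EIOCBQUEUED && ret != -EIOCBRETRY
	    && done)
		ret = done;

	return ret;
}
```

For example, a 4096-byte write that fails with -ENOSPC with 1024 bytes left reports 3072, while the same failure with nothing written still reports -ENOSPC, and -EIOCBRETRY is returned unchanged either way.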

2008-01-04 18:19:48

by Zach Brown

Subject: Re: [PATCH] aio: partial write should not return error code.


>
> No, that's fine, here is the new one:
>
> When an AIO write gets a non-retry error after writing some data
> (eg. ENOSPC), it should return the amount written already, not the
> error. Just like write() is supposed to.
>
> This was found by the libaio test suite.
>
> Signed-off-by: Rusty Russell <[email protected]>

This looks good, feel free to push this from your tree.

Acked-by: Zach Brown <[email protected]>

- z