2010-06-24 22:15:34

by Jiaying Zhang

[permalink] [raw]
Subject: [PATCH] EXT4: move aio completion after unwritten extent conversion


EXT4: move aio completion after unwritten extent conversion

This patch is to be applied upon Christoph's "direct-io: move aio_complete
into ->end_io" patch. It adds iocb and result fields to struct ext4_io_end_t,
so that we can call aio_complete from ext4_end_io_nolock() after the extent
conversion has finished.

I have verified with Christoph's aio-dio test that used to fail after a few
runs on an original kernel but now succeeds on the patched kernel.

See http://thread.gmane.org/gmane.comp.file-systems.ext4/19659 for details.

Signed-off-by: Jiaying Zhang <[email protected]>

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 19a4de5..4c9f05d 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -167,13 +167,15 @@ struct mpage_da_data {
};
#define EXT4_IO_UNWRITTEN 0x1
typedef struct ext4_io_end {
- struct list_head list; /* per-file finished AIO list */
+ struct list_head list; /* per-file finished IO list */
struct inode *inode; /* file being written to */
unsigned int flag; /* unwritten or not */
struct page *page; /* page struct for buffer write */
loff_t offset; /* offset in the file */
ssize_t size; /* size of the extent */
struct work_struct work; /* data work queue */
+ struct kiocb *iocb; /* iocb struct for AIO */
+ int result; /* error value for AIO */
} ext4_io_end_t;

/*
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 0afc8c1..3fb64b9 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3668,6 +3668,8 @@ static int ext4_end_io_nolock(ext4_io_end_t *io)
return ret;
}

+ if (io->iocb)
+ aio_complete(io->iocb, io->result, 0);
/* clear the DIO AIO unwritten flag */
io->flag = 0;
return ret;
@@ -3767,6 +3769,8 @@ static ext4_io_end_t *ext4_init_io_end (struct inode *inode, gfp_t flags)
io->offset = 0;
io->size = 0;
io->page = NULL;
+ io->iocb = NULL;
+ io->result = 0;
INIT_WORK(&io->work, ext4_end_io_work);
INIT_LIST_HEAD(&io->list);
}
@@ -3796,12 +3800,18 @@ static void ext4_end_io_dio(struct kiocb *iocb, loff_t offset,
if (io_end->flag != EXT4_IO_UNWRITTEN){
ext4_free_io_end(io_end);
iocb->private = NULL;
- goto out;
+out:
+ if (is_async)
+ aio_complete(iocb, ret, 0);
+ return;
}

io_end->offset = offset;
io_end->size = size;
- io_end->flag = EXT4_IO_UNWRITTEN;
+ if (is_async) {
+ io_end->iocb = iocb;
+ io_end->result = ret;
+ }
wq = EXT4_SB(io_end->inode->i_sb)->dio_unwritten_wq;

/* queue the work to convert unwritten extents to written */
@@ -3813,9 +3823,6 @@ static void ext4_end_io_dio(struct kiocb *iocb, loff_t offset,
list_add_tail(&io_end->list, &ei->i_completed_io_list);
spin_unlock_irqrestore(&ei->i_completed_io_lock, flags);
iocb->private = NULL;
-out:
- if (is_async)
- aio_complete(iocb, ret, 0);
}

static void ext4_end_io_buffer_write(struct buffer_head *bh, int uptodate)


2010-06-28 14:49:57

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] EXT4: move aio completion after unwritten extent conversion

> if (io_end->flag != EXT4_IO_UNWRITTEN){
> ext4_free_io_end(io_end);
> iocb->private = NULL;
> - goto out;
> +out:
> + if (is_async)
> + aio_complete(iocb, ret, 0);
> + return;

I'd suggest keeping the out label at the end of the function. Without
that the code gets unreadable very quickly.

> io_end->size = size;
> - io_end->flag = EXT4_IO_UNWRITTEN;

Why is this initialization removed?


2010-06-28 18:41:16

by Jiaying Zhang

[permalink] [raw]
Subject: Re: [PATCH] EXT4: move aio completion after unwritten extent conversion

On Mon, Jun 28, 2010 at 7:49 AM, Christoph Hellwig <[email protected]> wrote:
>
> > ? ? ? if (io_end->flag != EXT4_IO_UNWRITTEN){
> > ? ? ? ? ? ? ? ext4_free_io_end(io_end);
> > ? ? ? ? ? ? ? iocb->private = NULL;
> > - ? ? ? ? ? ? goto out;
> > +out:
> > + ? ? ? ? ? ? if (is_async)
> > + ? ? ? ? ? ? ? ? ? ? aio_complete(iocb, ret, 0);
> > + ? ? ? ? ? ? return;
>
> I'd suggest keeping the out label at the end of the function. ?Without
> that the code gets unreadable very quickly.
>
I am fine with either. Having the 'out' exit label here saves us a jump in the
common code path while keeping it at the end of the function comply better
with the kernel code style. I will let Ted decide which one to take.

> > ? ? ? io_end->size = size;
> > - ? ? io_end->flag = EXT4_IO_UNWRITTEN;
>
> Why is this initialization removed?
>
This change is not related to the bug fix. I just realized that this
initialization seems to be unnecessary because we should already
return in the case that (io_end->flag != EXT4_IO_UNWRITTEN)
at the beginning of the function.