Missed out linux-aio & linux-fs-devel lists. Forwarding.
Comments ?
Suzuki K P
Linux Technology Center,
IBM Software Labs, India.
On Thu, Mar 09, 2006 at 01:24:38PM +0530, Suzuki wrote:
>
> Missed out linux-aio & linux-fs-devel lists. Forwarding.
>
> Comments ?
I've seen this too. The problem is that __generic_file_aio_read can return
with or without the i_mutex locked in the direct I/O case for filesystems
that set DIO_OWN_LOCKING. It's a nasty one and I haven't found a better solution
than copying lots of code from filemap.c into xfs.
On Thu, Mar 09, 2006 at 12:03:06PM +0000, Christoph Hellwig wrote:
> On Thu, Mar 09, 2006 at 01:24:38PM +0530, Suzuki wrote:
> >
> > Missed out linux-aio & linux-fs-devel lists. Forwarding.
> >
> > Comments ?
>
> I've seen this too. The problem is that __generic_file_aio_read can return
> with or without the i_mutex locked in the direct I/O case for filesystems
> that set DIO_OWN_LOCKING.
Not for reads AFAICT - __generic_file_aio_read + own-locking
should always have released i_mutex at the end of the direct
read - are you thinking of writes or have I missed something?
> It's a nasty one and I haven't found a better solution
> than copying lots of code from filemap.c into xfs.
Er, eek? Hopefully thats not needed - from my reading of the
code, all the i_mutex locking for direct reads lives inside
direct-io.c, not filemap.c -- is the solution from my other
mail not workable? (isn't it only writes that has the wierd
buffered I/O fallback + i_sem/i_mutex locking interaction?).
thanks.
--
Nathan
On Fri, Mar 10, 2006 at 09:30:42AM +1100, Nathan Scott wrote:
> Not for reads AFAICT - __generic_file_aio_read + own-locking
> should always have released i_mutex at the end of the direct
> read - are you thinking of writes or have I missed something?
if an error occurs before a_ops->direct_IO is called __generic_file_aio_read
will return with i_mutex still locked. Note that checking for negative
return values is not enough as __blockdev_direct_IO can return errors
aswell.
On Thu, Mar 09, 2006 at 10:42:19PM +0000, Christoph Hellwig wrote:
> On Fri, Mar 10, 2006 at 09:30:42AM +1100, Nathan Scott wrote:
> > Not for reads AFAICT - __generic_file_aio_read + own-locking
> > should always have released i_mutex at the end of the direct
> > read - are you thinking of writes or have I missed something?
>
> if an error occurs before a_ops->direct_IO is called __generic_file_aio_read
> will return with i_mutex still locked. Note that checking for negative
> return values is not enough as __blockdev_direct_IO can return errors
> aswell.
*groan* - right you are. Another option may be to have the
generic dio+own-locking case reacquire i_mutex if it drops
it, before returning... thoughts? Seems alot less invasive
than the filemap.c code dup'ing thing.
cheers.
--
Nathan
On Fri, Mar 10, 2006 at 10:14:22AM +1100, Nathan Scott wrote:
> On Thu, Mar 09, 2006 at 10:42:19PM +0000, Christoph Hellwig wrote:
> > On Fri, Mar 10, 2006 at 09:30:42AM +1100, Nathan Scott wrote:
> > > Not for reads AFAICT - __generic_file_aio_read + own-locking
> > > should always have released i_mutex at the end of the direct
> > > read - are you thinking of writes or have I missed something?
> >
> > if an error occurs before a_ops->direct_IO is called __generic_file_aio_read
> > will return with i_mutex still locked. Note that checking for negative
> > return values is not enough as __blockdev_direct_IO can return errors
> > aswell.
>
> *groan* - right you are. Another option may be to have the
> generic dio+own-locking case reacquire i_mutex if it drops
> it, before returning... thoughts? Seems alot less invasive
> than the filemap.c code dup'ing thing.
Something like this (works OK for me)...
cheers.
--
Nathan
Index: 2.6.x-xfs/fs/direct-io.c
===================================================================
--- 2.6.x-xfs.orig/fs/direct-io.c
+++ 2.6.x-xfs/fs/direct-io.c
@@ -1155,15 +1155,16 @@ direct_io_worker(int rw, struct kiocb *i
* For writes, i_mutex is not held on entry; it is never taken.
*
* DIO_LOCKING (simple locking for regular files)
- * For writes we are called under i_mutex and return with i_mutex held, even though
- * it is internally dropped.
+ * For writes we are called under i_mutex and return with i_mutex held, even
+ * though it is internally dropped.
* For reads, i_mutex is not held on entry, but it is taken and dropped before
* returning.
*
* DIO_OWN_LOCKING (filesystem provides synchronisation and handling of
* uninitialised data, allowing parallel direct readers and writers)
* For writes we are called without i_mutex, return without it, never touch it.
- * For reads, i_mutex is held on entry and will be released before returning.
+ * For reads we are called under i_mutex and return with i_mutex held, even
+ * though it may be internally dropped.
*
* Additional i_alloc_sem locking requirements described inline below.
*/
@@ -1182,7 +1183,8 @@ __blockdev_direct_IO(int rw, struct kioc
ssize_t retval = -EINVAL;
loff_t end = offset;
struct dio *dio;
- int reader_with_isem = (rw == READ && dio_lock_type == DIO_OWN_LOCKING);
+ int release_i_mutex = 0;
+ int acquire_i_mutex = 0;
if (rw & WRITE)
current->flags |= PF_SYNCWRITE;
@@ -1225,7 +1227,6 @@ __blockdev_direct_IO(int rw, struct kioc
* writers need to grab i_alloc_sem only (i_mutex is already held)
* For regular files using DIO_OWN_LOCKING,
* neither readers nor writers take any locks here
- * (i_mutex is already held and release for writers here)
*/
dio->lock_type = dio_lock_type;
if (dio_lock_type != DIO_NO_LOCKING) {
@@ -1236,7 +1237,7 @@ __blockdev_direct_IO(int rw, struct kioc
mapping = iocb->ki_filp->f_mapping;
if (dio_lock_type != DIO_OWN_LOCKING) {
mutex_lock(&inode->i_mutex);
- reader_with_isem = 1;
+ release_i_mutex = 1;
}
retval = filemap_write_and_wait_range(mapping, offset,
@@ -1248,7 +1249,7 @@ __blockdev_direct_IO(int rw, struct kioc
if (dio_lock_type == DIO_OWN_LOCKING) {
mutex_unlock(&inode->i_mutex);
- reader_with_isem = 0;
+ acquire_i_mutex = 1;
}
}
@@ -1269,11 +1270,13 @@ __blockdev_direct_IO(int rw, struct kioc
nr_segs, blkbits, get_blocks, end_io, dio);
if (rw == READ && dio_lock_type == DIO_LOCKING)
- reader_with_isem = 0;
+ release_i_mutex = 0;
out:
- if (reader_with_isem)
+ if (release_i_mutex)
mutex_unlock(&inode->i_mutex);
+ else if (acquire_i_mutex)
+ mutex_lock(&inode->i_mutex);
if (rw & WRITE)
current->flags &= ~PF_SYNCWRITE;
return retval;
On Fri, Mar 10, 2006 at 11:50:20AM +1100, Nathan Scott wrote:
> Something like this (works OK for me)...
Yeah, that should work for now. But long-term we really need to redo
direct I/O locking to have a common scheme for all filesystems. I've heard
birds whistling RH patches yet another scheme into RHEL4 for GFS an it's
definitly already far too complex now.
On Fri, Mar 10, 2006 at 03:49:25PM +0000, Christoph Hellwig wrote:
> On Fri, Mar 10, 2006 at 11:50:20AM +1100, Nathan Scott wrote:
> > Something like this (works OK for me)...
>
> Yeah, that should work for now. But long-term we really need to redo
> direct I/O locking to have a common scheme for all filesystems. I've heard
> birds whistling RH patches yet another scheme into RHEL4 for GFS an it's
> definitly already far too complex now.
Yup, getting rid of the need for all these confusing locking
modes was one of the objectives in mind for DIO simplification.
(http://www.kernel.org/pub/linux/kernel/people/suparna/DIO-simplify.txt)
Once we have an efficient range locking or similar mechanism in place
(Chris Mason is working on a patch), then it should be possible to push
out all of the i_mutex locking to higher level routines, outside of
direct-io.c.
Longer term, it would be nice to be able to rethink and further simplify
the whole _nolock equiv versions for VFS write methods. Especially the
percolation down to sync_page_range_nolock, etc.
Regards
Suparna
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-aio' in
> the body to [email protected]. For more info on Linux AIO,
> see: http://www.kvack.org/aio/
> Don't email: <a href=mailto:"[email protected]">[email protected]</a>
--
Suparna Bhattacharya ([email protected])
Linux Technology Center
IBM Software Lab, India
On Fri, Mar 10, 2006 at 11:50:20AM +1100, Nathan Scott wrote:
> On Fri, Mar 10, 2006 at 10:14:22AM +1100, Nathan Scott wrote:
> > On Thu, Mar 09, 2006 at 10:42:19PM +0000, Christoph Hellwig wrote:
> > > On Fri, Mar 10, 2006 at 09:30:42AM +1100, Nathan Scott wrote:
> > > > Not for reads AFAICT - __generic_file_aio_read + own-locking
> > > > should always have released i_mutex at the end of the direct
> > > > read - are you thinking of writes or have I missed something?
> > >
> > > if an error occurs before a_ops->direct_IO is called __generic_file_aio_read
> > > will return with i_mutex still locked. Note that checking for negative
> > > return values is not enough as __blockdev_direct_IO can return errors
> > > aswell.
> >
> > *groan* - right you are. Another option may be to have the
> > generic dio+own-locking case reacquire i_mutex if it drops
> > it, before returning... thoughts? Seems alot less invasive
> > than the filemap.c code dup'ing thing.
>
> Something like this (works OK for me)...
Is this 2.6.16 material?
> cheers.
> Nathan
>...
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
On Fri, Mar 17, 2006 at 06:22:10PM +0100, Adrian Bunk wrote:
> On Fri, Mar 10, 2006 at 11:50:20AM +1100, Nathan Scott wrote:
> > Something like this (works OK for me)...
>
> Is this 2.6.16 material?
Its been merged already.
cheers.
--
Nathan
On Sat, Mar 18, 2006 at 02:34:44PM +1100, Nathan Scott wrote:
> On Fri, Mar 17, 2006 at 06:22:10PM +0100, Adrian Bunk wrote:
> > On Fri, Mar 10, 2006 at 11:50:20AM +1100, Nathan Scott wrote:
> > > Something like this (works OK for me)...
> >
> > Is this 2.6.16 material?
>
> Its been merged already.
Ups, sorry for missing this.
> cheers.
> Nathan
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed