2006-03-09 07:49:00

by suzuki

[permalink] [raw]
Subject: [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests


Missed out linux-aio & linux-fs-devel lists. Forwarding.

Comments ?


Suzuki K P
Linux Technology Center,
IBM Software Labs, India.


Attachments:
[RFC] Badness in __mutex_unlock_slowpath with XFS stress tests.eml (1.98 kB)

2006-03-09 12:05:26

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests

On Thu, Mar 09, 2006 at 01:24:38PM +0530, Suzuki wrote:
>
> Missed out linux-aio & linux-fs-devel lists. Forwarding.
>
> Comments ?

I've seen this too. The problem is that __generic_file_aio_read can return
with or without the i_mutex locked in the direct I/O case for filesystems
that set DIO_OWN_LOCKING. It's a nasty one and I haven't found a better solution
than copying lots of code from filemap.c into xfs.

2006-03-09 22:34:38

by Nathan Scott

[permalink] [raw]
Subject: Re: [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests

On Thu, Mar 09, 2006 at 12:03:06PM +0000, Christoph Hellwig wrote:
> On Thu, Mar 09, 2006 at 01:24:38PM +0530, Suzuki wrote:
> >
> > Missed out linux-aio & linux-fs-devel lists. Forwarding.
> >
> > Comments ?
>
> I've seen this too. The problem is that __generic_file_aio_read can return
> with or without the i_mutex locked in the direct I/O case for filesystems
> that set DIO_OWN_LOCKING.

Not for reads AFAICT - __generic_file_aio_read + own-locking
should always have released i_mutex at the end of the direct
read - are you thinking of writes or have I missed something?

> It's a nasty one and I haven't found a better solution
> than copying lots of code from filemap.c into xfs.

Er, eek? Hopefully thats not needed - from my reading of the
code, all the i_mutex locking for direct reads lives inside
direct-io.c, not filemap.c -- is the solution from my other
mail not workable? (isn't it only writes that has the wierd
buffered I/O fallback + i_sem/i_mutex locking interaction?).

thanks.

--
Nathan

2006-03-09 22:42:26

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests

On Fri, Mar 10, 2006 at 09:30:42AM +1100, Nathan Scott wrote:
> Not for reads AFAICT - __generic_file_aio_read + own-locking
> should always have released i_mutex at the end of the direct
> read - are you thinking of writes or have I missed something?

if an error occurs before a_ops->direct_IO is called __generic_file_aio_read
will return with i_mutex still locked. Note that checking for negative
return values is not enough as __blockdev_direct_IO can return errors
aswell.

2006-03-09 23:18:21

by Nathan Scott

[permalink] [raw]
Subject: Re: [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests

On Thu, Mar 09, 2006 at 10:42:19PM +0000, Christoph Hellwig wrote:
> On Fri, Mar 10, 2006 at 09:30:42AM +1100, Nathan Scott wrote:
> > Not for reads AFAICT - __generic_file_aio_read + own-locking
> > should always have released i_mutex at the end of the direct
> > read - are you thinking of writes or have I missed something?
>
> if an error occurs before a_ops->direct_IO is called __generic_file_aio_read
> will return with i_mutex still locked. Note that checking for negative
> return values is not enough as __blockdev_direct_IO can return errors
> aswell.

*groan* - right you are. Another option may be to have the
generic dio+own-locking case reacquire i_mutex if it drops
it, before returning... thoughts? Seems alot less invasive
than the filemap.c code dup'ing thing.

cheers.

--
Nathan

2006-03-10 00:54:31

by Nathan Scott

[permalink] [raw]
Subject: Re: [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests

On Fri, Mar 10, 2006 at 10:14:22AM +1100, Nathan Scott wrote:
> On Thu, Mar 09, 2006 at 10:42:19PM +0000, Christoph Hellwig wrote:
> > On Fri, Mar 10, 2006 at 09:30:42AM +1100, Nathan Scott wrote:
> > > Not for reads AFAICT - __generic_file_aio_read + own-locking
> > > should always have released i_mutex at the end of the direct
> > > read - are you thinking of writes or have I missed something?
> >
> > if an error occurs before a_ops->direct_IO is called __generic_file_aio_read
> > will return with i_mutex still locked. Note that checking for negative
> > return values is not enough as __blockdev_direct_IO can return errors
> > aswell.
>
> *groan* - right you are. Another option may be to have the
> generic dio+own-locking case reacquire i_mutex if it drops
> it, before returning... thoughts? Seems alot less invasive
> than the filemap.c code dup'ing thing.

Something like this (works OK for me)...

cheers.

--
Nathan


Index: 2.6.x-xfs/fs/direct-io.c
===================================================================
--- 2.6.x-xfs.orig/fs/direct-io.c
+++ 2.6.x-xfs/fs/direct-io.c
@@ -1155,15 +1155,16 @@ direct_io_worker(int rw, struct kiocb *i
* For writes, i_mutex is not held on entry; it is never taken.
*
* DIO_LOCKING (simple locking for regular files)
- * For writes we are called under i_mutex and return with i_mutex held, even though
- * it is internally dropped.
+ * For writes we are called under i_mutex and return with i_mutex held, even
+ * though it is internally dropped.
* For reads, i_mutex is not held on entry, but it is taken and dropped before
* returning.
*
* DIO_OWN_LOCKING (filesystem provides synchronisation and handling of
* uninitialised data, allowing parallel direct readers and writers)
* For writes we are called without i_mutex, return without it, never touch it.
- * For reads, i_mutex is held on entry and will be released before returning.
+ * For reads we are called under i_mutex and return with i_mutex held, even
+ * though it may be internally dropped.
*
* Additional i_alloc_sem locking requirements described inline below.
*/
@@ -1182,7 +1183,8 @@ __blockdev_direct_IO(int rw, struct kioc
ssize_t retval = -EINVAL;
loff_t end = offset;
struct dio *dio;
- int reader_with_isem = (rw == READ && dio_lock_type == DIO_OWN_LOCKING);
+ int release_i_mutex = 0;
+ int acquire_i_mutex = 0;

if (rw & WRITE)
current->flags |= PF_SYNCWRITE;
@@ -1225,7 +1227,6 @@ __blockdev_direct_IO(int rw, struct kioc
* writers need to grab i_alloc_sem only (i_mutex is already held)
* For regular files using DIO_OWN_LOCKING,
* neither readers nor writers take any locks here
- * (i_mutex is already held and release for writers here)
*/
dio->lock_type = dio_lock_type;
if (dio_lock_type != DIO_NO_LOCKING) {
@@ -1236,7 +1237,7 @@ __blockdev_direct_IO(int rw, struct kioc
mapping = iocb->ki_filp->f_mapping;
if (dio_lock_type != DIO_OWN_LOCKING) {
mutex_lock(&inode->i_mutex);
- reader_with_isem = 1;
+ release_i_mutex = 1;
}

retval = filemap_write_and_wait_range(mapping, offset,
@@ -1248,7 +1249,7 @@ __blockdev_direct_IO(int rw, struct kioc

if (dio_lock_type == DIO_OWN_LOCKING) {
mutex_unlock(&inode->i_mutex);
- reader_with_isem = 0;
+ acquire_i_mutex = 1;
}
}

@@ -1269,11 +1270,13 @@ __blockdev_direct_IO(int rw, struct kioc
nr_segs, blkbits, get_blocks, end_io, dio);

if (rw == READ && dio_lock_type == DIO_LOCKING)
- reader_with_isem = 0;
+ release_i_mutex = 0;

out:
- if (reader_with_isem)
+ if (release_i_mutex)
mutex_unlock(&inode->i_mutex);
+ else if (acquire_i_mutex)
+ mutex_lock(&inode->i_mutex);
if (rw & WRITE)
current->flags &= ~PF_SYNCWRITE;
return retval;

2006-03-10 15:49:35

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests

On Fri, Mar 10, 2006 at 11:50:20AM +1100, Nathan Scott wrote:
> Something like this (works OK for me)...

Yeah, that should work for now. But long-term we really need to redo
direct I/O locking to have a common scheme for all filesystems. I've heard
birds whistling RH patches yet another scheme into RHEL4 for GFS an it's
definitly already far too complex now.

2006-03-14 04:46:06

by Suparna Bhattacharya

[permalink] [raw]
Subject: Re: [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests

On Fri, Mar 10, 2006 at 03:49:25PM +0000, Christoph Hellwig wrote:
> On Fri, Mar 10, 2006 at 11:50:20AM +1100, Nathan Scott wrote:
> > Something like this (works OK for me)...
>
> Yeah, that should work for now. But long-term we really need to redo
> direct I/O locking to have a common scheme for all filesystems. I've heard
> birds whistling RH patches yet another scheme into RHEL4 for GFS an it's
> definitly already far too complex now.

Yup, getting rid of the need for all these confusing locking
modes was one of the objectives in mind for DIO simplification.
(http://www.kernel.org/pub/linux/kernel/people/suparna/DIO-simplify.txt)
Once we have an efficient range locking or similar mechanism in place
(Chris Mason is working on a patch), then it should be possible to push
out all of the i_mutex locking to higher level routines, outside of
direct-io.c.

Longer term, it would be nice to be able to rethink and further simplify
the whole _nolock equiv versions for VFS write methods. Especially the
percolation down to sync_page_range_nolock, etc.

Regards
Suparna

>
> --
> To unsubscribe, send a message with 'unsubscribe linux-aio' in
> the body to [email protected]. For more info on Linux AIO,
> see: http://www.kvack.org/aio/
> Don't email: <a href=mailto:"[email protected]">[email protected]</a>

--
Suparna Bhattacharya ([email protected])
Linux Technology Center
IBM Software Lab, India

2006-03-17 17:22:13

by Adrian Bunk

[permalink] [raw]
Subject: Re: [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests

On Fri, Mar 10, 2006 at 11:50:20AM +1100, Nathan Scott wrote:
> On Fri, Mar 10, 2006 at 10:14:22AM +1100, Nathan Scott wrote:
> > On Thu, Mar 09, 2006 at 10:42:19PM +0000, Christoph Hellwig wrote:
> > > On Fri, Mar 10, 2006 at 09:30:42AM +1100, Nathan Scott wrote:
> > > > Not for reads AFAICT - __generic_file_aio_read + own-locking
> > > > should always have released i_mutex at the end of the direct
> > > > read - are you thinking of writes or have I missed something?
> > >
> > > if an error occurs before a_ops->direct_IO is called __generic_file_aio_read
> > > will return with i_mutex still locked. Note that checking for negative
> > > return values is not enough as __blockdev_direct_IO can return errors
> > > aswell.
> >
> > *groan* - right you are. Another option may be to have the
> > generic dio+own-locking case reacquire i_mutex if it drops
> > it, before returning... thoughts? Seems alot less invasive
> > than the filemap.c code dup'ing thing.
>
> Something like this (works OK for me)...

Is this 2.6.16 material?

> cheers.
> Nathan
>...

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2006-03-18 03:35:43

by Nathan Scott

[permalink] [raw]
Subject: Re: [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests

On Fri, Mar 17, 2006 at 06:22:10PM +0100, Adrian Bunk wrote:
> On Fri, Mar 10, 2006 at 11:50:20AM +1100, Nathan Scott wrote:
> > Something like this (works OK for me)...
>
> Is this 2.6.16 material?

Its been merged already.

cheers.

--
Nathan

2006-03-18 05:03:34

by Adrian Bunk

[permalink] [raw]
Subject: Re: [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests

On Sat, Mar 18, 2006 at 02:34:44PM +1100, Nathan Scott wrote:
> On Fri, Mar 17, 2006 at 06:22:10PM +0100, Adrian Bunk wrote:
> > On Fri, Mar 10, 2006 at 11:50:20AM +1100, Nathan Scott wrote:
> > > Something like this (works OK for me)...
> >
> > Is this 2.6.16 material?
>
> Its been merged already.

Ups, sorry for missing this.

> cheers.
> Nathan

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed