2005-07-11 10:32:58

by Olaf Kirch

[permalink] [raw]
Subject: [PATCH fs/locks 3 of 3] Fix race conditions with file lock vs close

# Subject: Fix race conditions with file lock vs close
#
# When a multithreaded application tries to obtain a lock on a remote
# file server (NFS or other), while another thread closes the file, we
# risk oopsing or running into a BUG(). This can happen if the close
# happens inbetween sys_fcntl() grabbing the file pointer, and before
# the call to __posix_lock_file.
#
# The fix is to change fcntl_setlk* and make them check if the file
# is still open after the lock has been obtained.
#
# Signed-off-by: Olaf Kirch <[email protected]>

Index: linux-2.6.12/fs/fcntl.c
===================================================================
--- linux-2.6.12.orig/fs/fcntl.c 2005-07-08 11:45:17.000000000 +0200
+++ linux-2.6.12/fs/fcntl.c 2005-07-11 12:17:31.000000000 +0200
@@ -288,7 +288,7 @@ static long do_fcntl(int fd, unsigned in
break;
case F_SETLK:
case F_SETLKW:
- err = fcntl_setlk(filp, cmd, (struct flock __user *) arg);
+ err = fcntl_setlk(filp, cmd, (struct flock __user *) arg, fd);
break;
case F_GETOWN:
/*
@@ -376,7 +376,7 @@ asmlinkage long sys_fcntl64(unsigned int
break;
case F_SETLK64:
case F_SETLKW64:
- err = fcntl_setlk64(filp, cmd, (struct flock64 __user *) arg);
+ err = fcntl_setlk64(filp, cmd, (struct flock64 __user *) arg, fd);
break;
default:
err = do_fcntl(fd, cmd, arg, filp);
Index: linux-2.6.12/fs/locks.c
===================================================================
--- linux-2.6.12.orig/fs/locks.c 2005-07-11 12:17:15.000000000 +0200
+++ linux-2.6.12/fs/locks.c 2005-07-11 12:17:31.000000000 +0200
@@ -1591,7 +1591,7 @@ out:
/* Apply the lock described by l to an open file descriptor.
* This implements both the F_SETLK and F_SETLKW commands of fcntl().
*/
-int fcntl_setlk(struct file *filp, unsigned int cmd, struct flock __user *l)
+int fcntl_setlk(struct file *filp, unsigned int cmd, struct flock __user *l, int fd)
{
struct file_lock *file_lock = locks_alloc_lock();
struct flock flock;
@@ -1666,6 +1666,18 @@ int fcntl_setlk(struct file *filp, unsig
break;
}

+ /* Make sure the file is still open, otherwise we need
+ * to release the lock we just claimed. */
+ spin_lock(&current->files->file_lock);
+ if (current->files->fd[fd] != filp) {
+ /* Darn - someone closed the file in the meanwhile. */
+ spin_unlock(&current->files->file_lock);
+ locks_remove_posix(filp, current->files);
+ error = -EBADF;
+ } else {
+ spin_unlock(&current->files->file_lock);
+ }
+
out:
locks_free_lock(file_lock);
return error;
@@ -1722,7 +1734,7 @@ out:
/* Apply the lock described by l to an open file descriptor.
* This implements both the F_SETLK and F_SETLKW commands of fcntl().
*/
-int fcntl_setlk64(struct file *filp, unsigned int cmd, struct flock64 __user *l)
+int fcntl_setlk64(struct file *filp, unsigned int cmd, struct flock64 __user *l, int fd)
{
struct file_lock *file_lock = locks_alloc_lock();
struct flock64 flock;
@@ -1797,6 +1809,18 @@ int fcntl_setlk64(struct file *filp, uns
break;
}

+ /* Make sure the file is still open, otherwise we need
+ * to release the lock we just claimed. */
+ spin_lock(&current->files->file_lock);
+ if (current->files->fd[fd] != filp) {
+ /* Darn - someone closed the file in the meanwhile. */
+ spin_unlock(&current->files->file_lock);
+ locks_remove_posix(filp, current->files);
+ error = -EBADF;
+ } else {
+ spin_unlock(&current->files->file_lock);
+ }
+
out:
locks_free_lock(file_lock);
return error;
Index: linux-2.6.12/include/linux/fs.h
===================================================================
--- linux-2.6.12.orig/include/linux/fs.h 2005-07-08 11:45:21.000000000 +0200
+++ linux-2.6.12/include/linux/fs.h 2005-07-11 12:17:31.000000000 +0200
@@ -689,11 +689,11 @@ extern struct list_head file_lock_list;
#include <linux/fcntl.h>

extern int fcntl_getlk(struct file *, struct flock __user *);
-extern int fcntl_setlk(struct file *, unsigned int, struct flock __user *);
+extern int fcntl_setlk(struct file *, unsigned int, struct flock __user *, int);

#if BITS_PER_LONG == 32
extern int fcntl_getlk64(struct file *, struct flock64 __user *);
-extern int fcntl_setlk64(struct file *, unsigned int, struct flock64 __user *);
+extern int fcntl_setlk64(struct file *, unsigned int, struct flock64 __user *, int);
#endif

extern void send_sigio(struct fown_struct *fown, int fd, int band);

--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[email protected] | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax


-------------------------------------------------------
This SF.Net email is sponsored by the 'Do More With Dual!' webinar happening
July 14 at 8am PDT/11am EDT. We invite you to explore the latest in dual
core and dual graphics technology at this free one hour event hosted by HP,
AMD, and NVIDIA. To register visit http://www.hp.com/go/dualwebinar
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2005-07-11 12:33:05

by Peter Staubach

[permalink] [raw]
Subject: Re: [PATCH fs/locks 3 of 3] Fix race conditions with file lock vs close

--- ./fs/locks.c.org 2005-06-27 08:54:55.000000000 -0400
+++ ./fs/locks.c 2005-06-27 13:24:38.447157408 -0400
@@ -1589,7 +1589,8 @@ out:
/* Apply the lock described by l to an open file descriptor.
* This implements both the F_SETLK and F_SETLKW commands of fcntl().
*/
-int fcntl_setlk(struct file *filp, unsigned int cmd, struct flock __user *l)
+int fcntl_setlk(unsigned int fd, struct file *filp, unsigned int cmd,
+ struct flock __user *l)
{
struct file_lock *file_lock = locks_alloc_lock();
struct flock flock;
@@ -1618,6 +1619,7 @@ int fcntl_setlk(struct file *filp, unsig
goto out;
}

+again:
error = flock_to_posix_lock(filp, file_lock, &flock);
if (error)
goto out;
@@ -1646,25 +1648,34 @@ int fcntl_setlk(struct file *filp, unsig
if (error)
goto out;

- if (filp->f_op && filp->f_op->lock != NULL) {
+ if (filp->f_op && filp->f_op->lock != NULL)
error = filp->f_op->lock(filp, cmd, file_lock);
- goto out;
- }
-
- for (;;) {
- error = __posix_lock_file(inode, file_lock);
- if ((error != -EAGAIN) || (cmd == F_SETLK))
+ else {
+ for (;;) {
+ error = __posix_lock_file(inode, file_lock);
+ if ((error != -EAGAIN) || (cmd == F_SETLK))
+ break;
+ error = wait_event_interruptible(file_lock->fl_wait,
+ !file_lock->fl_next);
+ if (!error)
+ continue;
+
+ locks_delete_block(file_lock);
break;
- error = wait_event_interruptible(file_lock->fl_wait,
- !file_lock->fl_next);
- if (!error)
- continue;
+ }
+ }

- locks_delete_block(file_lock);
- break;
+ /*
+ * Attempt to detect a close/fcntl race and recover by
+ * releasing the lock that was just acquired.
+ */
+ if (!error &&
+ cmd != F_UNLCK && fcheck(fd) != filp && flock.l_type != F_UNLCK) {
+ flock.l_type = F_UNLCK;
+ goto again;
}

- out:
+out:
locks_free_lock(file_lock);
return error;
}
@@ -1722,7 +1733,8 @@ out:
/* Apply the lock described by l to an open file descriptor.
* This implements both the F_SETLK and F_SETLKW commands of fcntl().
*/
-int fcntl_setlk64(struct file *filp, unsigned int cmd, struct flock64 __user *l)
+int fcntl_setlk64(unsigned int fd, struct file *filp, unsigned int cmd,
+ struct flock64 __user *l)
{
struct file_lock *file_lock = locks_alloc_lock();
struct flock64 flock;
@@ -1751,6 +1763,7 @@ int fcntl_setlk64(struct file *filp, uns
goto out;
}

+again:
error = flock64_to_posix_lock(filp, file_lock, &flock);
if (error)
goto out;
@@ -1779,22 +1792,31 @@ int fcntl_setlk64(struct file *filp, uns
if (error)
goto out;

- if (filp->f_op && filp->f_op->lock != NULL) {
+ if (filp->f_op && filp->f_op->lock != NULL)
error = filp->f_op->lock(filp, cmd, file_lock);
- goto out;
- }
-
- for (;;) {
- error = __posix_lock_file(inode, file_lock);
- if ((error != -EAGAIN) || (cmd == F_SETLK64))
+ else {
+ for (;;) {
+ error = __posix_lock_file(inode, file_lock);
+ if ((error != -EAGAIN) || (cmd == F_SETLK64))
+ break;
+ error = wait_event_interruptible(file_lock->fl_wait,
+ !file_lock->fl_next);
+ if (!error)
+ continue;
+
+ locks_delete_block(file_lock);
break;
- error = wait_event_interruptible(file_lock->fl_wait,
- !file_lock->fl_next);
- if (!error)
- continue;
+ }
+ }

- locks_delete_block(file_lock);
- break;
+ /*
+ * Attempt to detect a close/fcntl race and recover by
+ * releasing the lock that was just acquired.
+ */
+ if (!error &&
+ cmd != F_UNLCK && fcheck(fd) != filp && flock.l_type != F_UNLCK) {
+ flock.l_type = F_UNLCK;
+ goto again;
}

out:
@@ -1886,12 +1908,7 @@ void locks_remove_flock(struct file *fil

while ((fl = *before) != NULL) {
if (fl->fl_file == filp) {
- /*
- * We might have a POSIX lock that was created at the same time
- * the filp was closed for the last time. Just remove that too,
- * regardless of ownership, since nobody can own it.
- */
- if (IS_FLOCK(fl) || IS_POSIX(fl)) {
+ if (IS_FLOCK(fl)) {
locks_delete_lock(before);
continue;
}
--- ./fs/fcntl.c.org 2005-06-27 08:54:57.000000000 -0400
+++ ./fs/fcntl.c 2005-06-27 13:24:38.471153760 -0400
@@ -290,7 +290,7 @@ static long do_fcntl(int fd, unsigned in
break;
case F_SETLK:
case F_SETLKW:
- err = fcntl_setlk(filp, cmd, (struct flock __user *) arg);
+ err = fcntl_setlk(fd, filp, cmd, (struct flock __user *) arg);
break;
case F_GETOWN:
/*
@@ -378,7 +378,8 @@ asmlinkage long sys_fcntl64(unsigned int
break;
case F_SETLK64:
case F_SETLKW64:
- err = fcntl_setlk64(filp, cmd, (struct flock64 __user *) arg);
+ err = fcntl_setlk64(fd, filp, cmd,
+ (struct flock64 __user *) arg);
break;
default:
err = do_fcntl(fd, cmd, arg, filp);
--- ./include/linux/fs.h.org 2005-06-27 08:54:57.000000000 -0400
+++ ./include/linux/fs.h 2005-06-27 13:24:38.444157864 -0400
@@ -691,11 +691,13 @@ extern struct list_head file_lock_list;
#include <linux/fcntl.h>

extern int fcntl_getlk(struct file *, struct flock __user *);
-extern int fcntl_setlk(struct file *, unsigned int, struct flock __user *);
+extern int fcntl_setlk(unsigned int, struct file *, unsigned int,
+ struct flock __user *);

#if BITS_PER_LONG == 32
extern int fcntl_getlk64(struct file *, struct flock64 __user *);
-extern int fcntl_setlk64(struct file *, unsigned int, struct flock64 __user *);
+extern int fcntl_setlk64(unsigned int, struct file *, unsigned int,
+ struct flock64 __user *);
#endif

extern void send_sigio(struct fown_struct *fown, int fd, int band);


Attachments:
devel (5.36 kB)

2005-07-11 12:54:25

by Olaf Kirch

[permalink] [raw]
Subject: Re: [PATCH fs/locks 3 of 3] Fix race conditions with file lock vs close

On Mon, Jul 11, 2005 at 08:32:05AM -0400, Peter Staubach wrote:
> A different patch was submitted upstream already to address the fcntl/close
> race. It was accepted into the '-mm' kernel on June 28'th. That patch is
> attached.

I see, it seems this customer has been spamming just about everyone
about this problem :)

I just wonder how the following can work - your patch seems to drop
local book-keeping of locks completely, unless I misread it:


> + if (filp->f_op && filp->f_op->lock != NULL)
> error = filp->f_op->lock(filp, cmd, file_lock);
> + else {
> + for (;;) {
> + error = __posix_lock_file(inode, file_lock);
> + if ((error != -EAGAIN) || (cmd == F_SETLK))
> + break;
> + error = wait_event_interruptible(file_lock->fl_wait,
> + !file_lock->fl_next);
> + if (!error)
> + continue;
> +
> + locks_delete_block(file_lock);
> break;
> + }
> + }

This means if the application establishes an NFS lock, this will not
be reflected locally - and as a consequence, when the application just
exits, the lock will never be released.

Olaf
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[email protected] | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax


-------------------------------------------------------
This SF.Net email is sponsored by the 'Do More With Dual!' webinar happening
July 14 at 8am PDT/11am EDT. We invite you to explore the latest in dual
core and dual graphics technology at this free one hour event hosted by HP,
AMD, and NVIDIA. To register visit http://www.hp.com/go/dualwebinar
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-07-11 13:04:51

by Peter Staubach

[permalink] [raw]
Subject: Re: [PATCH fs/locks 3 of 3] Fix race conditions with file lock vs close

Olaf Kirch wrote:

>On Mon, Jul 11, 2005 at 08:32:05AM -0400, Peter Staubach wrote:
>
>
>>A different patch was submitted upstream already to address the fcntl/close
>>race. It was accepted into the '-mm' kernel on June 28'th. That patch is
>>attached.
>>
>>
>
>I see, it seems this customer has been spamming just about everyone
>about this problem :)
>
>I just wonder how the following can work - your patch seems to drop
>local book-keeping of locks completely, unless I misread it:
>
>
>
>
>>+ if (filp->f_op && filp->f_op->lock != NULL)
>> error = filp->f_op->lock(filp, cmd, file_lock);
>>+ else {
>>+ for (;;) {
>>+ error = __posix_lock_file(inode, file_lock);
>>+ if ((error != -EAGAIN) || (cmd == F_SETLK))
>>+ break;
>>+ error = wait_event_interruptible(file_lock->fl_wait,
>>+ !file_lock->fl_next);
>>+ if (!error)
>>+ continue;
>>+
>>+ locks_delete_block(file_lock);
>> break;
>>+ }
>>+ }
>>
>>
>
>This means if the application establishes an NFS lock, this will not
>be reflected locally - and as a consequence, when the application just
>exits, the lock will never be released.
>

I didn't change the way that the code works in this area, I just changed the
indentation. As far as I can tell, the new code works the same way as the
old code did, except that it also doesn't leave dangling locks
around... :-)

I wrote a small program to reproduce the race that I thought that the 2.6
code was not already handling. It is attached. (Please note that it needs
to run with a large-ish delay value, something like 40 or 45 seconds. There
is some interaction with the Linux threading which I haven't had/made time
to investigate yet.) I can also attach the email that I sent upstream
which describes my analysis, if you like.

Thanx...

ps


Attachments:
repo.c (3.32 kB)

2005-07-11 13:18:32

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH fs/locks 3 of 3] Fix race conditions with file lock vs close

m=C3=A5 den 11.07.2005 Klokka 14:54 (+0200) skreiv Olaf Kirch:

> I just wonder how the following can work - your patch seems to drop
> local book-keeping of locks completely, unless I misread it:
>=20
>=20
> > + if (filp->f_op && filp->f_op->lock !=3D NULL)
> > error =3D filp->f_op->lock(filp, cmd, file_lock);
> > + else {
> > + for (;;) {
> > + error =3D __posix_lock_file(inode, file_lock);
> > + if ((error !=3D -EAGAIN) || (cmd =3D=3D F_SETLK))
> > + break;
> > + error =3D wait_event_interruptible(file_lock->fl_wait,
> > + !file_lock->fl_next);
> > + if (!error)
> > + continue;
> > +=09
> > + locks_delete_block(file_lock);
> > break;
> > + }
> > + }
>=20
> This means if the application establishes an NFS lock, this will not
> be reflected locally - and as a consequence, when the application just
> exits, the lock will never be released.

In recent kernels, a filesystem that overrides the ->lock() method
should provide its own lock bookkeeping. Usually that means calling
posix_lock_file but it doesn't have to.

Cheers,
Trond



-------------------------------------------------------
This SF.Net email is sponsored by the 'Do More With Dual!' webinar happening
July 14 at 8am PDT/11am EDT. We invite you to explore the latest in dual
core and dual graphics technology at this free one hour event hosted by HP,
AMD, and NVIDIA. To register visit http://www.hp.com/go/dualwebinar
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-07-11 13:20:46

by Olaf Kirch

[permalink] [raw]
Subject: Re: [PATCH fs/locks 3 of 3] Fix race conditions with file lock vs close

On Mon, Jul 11, 2005 at 09:04:01AM -0400, Peter Staubach wrote:
> I didn't change the way that the code works in this area, I just changed the
> indentation. As far as I can tell, the new code works the same way as the
> old code did, except that it also doesn't leave dangling locks
> around... :-)

Doh, you're right. Time for a break :)

> I wrote a small program to reproduce the race that I thought that the 2.6
> code was not already handling. It is attached. (Please note that it needs
> to run with a large-ish delay value, something like 40 or 45 seconds. There
> is some interaction with the Linux threading which I haven't had/made time
> to investigate yet.) I can also attach the email that I sent upstream
> which describes my analysis, if you like.

Thanks, I'll give it a try.

While we're on the subject of swapping test cases, here's the one I've
been using.

Cheers
Olaf
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[email protected] | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax


Attachments:
(No filename) (1.01 kB)
fusi-lock-crasher.c (2.73 kB)
Download all attachments