2003-12-09 08:15:36

by Philippe Troin

[permalink] [raw]
Subject: Re: [NFS client] NFS locks not released on abnormal process termination

Trond Myklebust <[email protected]> writes:

> P? m? , 08/12/2003 klokka 12:32, skreiv Philippe Troin:
> > Trond, do you think I should push the patch to Marcelo, or should I
> > wait for a better fix?
>
> No. If I wanted a partial fix, I could just as well have pushed it to
> Marcelo myself. When I said "feel free" I was referring to pursuing the
> remaining signalling bugs.

Ok... Although I willing to "feel free", I might not have enough time
and ability...

> I have a feeling the second race case of your test is that you are
> interrupting the fcntl(F_SETLK) call while it is on the wire. If you do
> that, then the server may record the lock as taken, but the client will
> never receive the reply, and so will not know that it must clean up
> locks...
>
> Hmm... For that case, we probably want to have the locking code record
> the call as having succeeded in order to ensure that we do indeed clear
> out the lock on process exit. See if the appended patch helps...

Thanks Trond.

>From my reading of the patch, it supersedes the old patch, and is only
necessary on the client. Is also does not compile :-)
Here's an updated patch which does compile.


Attachments:
linux-2.4.23-nfs-lock-race-2.patch (4.04 kB)
(No filename) (212.00 B)
Download all attachments

2003-12-09 08:42:18

by Trond Myklebust

[permalink] [raw]
Subject: Re: [NFS client] NFS locks not released on abnormal process termination

>>>>> " " == Philippe Troin <[email protected]> writes:

> From my reading of the patch, it supersedes the old patch, and
> is only
> necessary on the client. Is also does not compile :-)

Yeah, I admit I didn't test it out...

> Here's an updated patch which does compile.

Thanks.

> I am still running tests, but so far it looks good (that is all
> locks are freed when a process with locks running on a NFS
> client is killed).

Good...

There are still 2 other issues with the generic POSIX locking code.
Both issues have to do with CLONE_VM and have been raised on
linux-kernel & linux-fsdevel. Unfortunately they met with no response,
so I'm unable to pursue...

Cheers,
Trond


-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
Does SourceForge.net help you be more productive? Does it
help you create better code? SHARE THE LOVE, and help us help
YOU! Click Here: http://sourceforge.net/donate/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2003-12-09 18:54:18

by Chris Croswhite

[permalink] [raw]
Subject: RE: Re: [NFS client] NFS locks not released on abnormal process termination

Philippe,

What patches are you refering to?

TIA,
Chris


-----Original Message-----
From: Philippe Troin [mailto:[email protected]]
Sent: Tue 09-Dec-03 10:46
To: Trond Myklebust
Cc: Kenny Simpson; [email protected]; =
[email protected]
Subject: [NFS] Re: [NFS client] NFS locks not released on abnormal =
process termination
Trond Myklebust <[email protected]> writes:

> >>>>> " " =3D=3D Philippe Troin <[email protected]> writes:
>=20
> > From my reading of the patch, it supersedes the old patch, and
> > is only
> > necessary on the client. Is also does not compile :-)
>=20
> Yeah, I admit I didn't test it out...
>=20
> > Here's an updated patch which does compile.
>=20
> Thanks.
>=20
> > I am still running tests, but so far it looks good (that is all
> > locks are freed when a process with locks running on a NFS
> > client is killed).
>=20
> Good...

I've ran test overnight on four boxen, and no locks were lost.
I guess you can send this patch to Marcello now.

I've tested with the enclosed program.

=20
> There are still 2 other issues with the generic POSIX locking code.
> Both issues have to do with CLONE_VM and have been raised on
> linux-kernel & linux-fsdevel. Unfortunately they met with no response,
> so I'm unable to pursue...

Can we help? Pointers?

Phil.





-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
Does SourceForge.net help you be more productive? Does it
help you create better code? SHARE THE LOVE, and help us help
YOU! Click Here: http://sourceforge.net/donate/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2003-12-09 19:28:42

by Philippe Troin

[permalink] [raw]
Subject: Re: Re: [NFS client] NFS locks not released on abnormal process termination

"Chris Croswhite" <[email protected]> writes:

> Philippe,
>
> What patches are you refering to?

The one in <[email protected]> named
linux-2.4.23-nfs-lock-race-2.patch

Here a link to MARC, since the sourceforge mailing list web page
sucks:

http://marc.theaimsgroup.com/?l=linux-nfs&m=107095817723325&w=2

Phil.

> -----Original Message-----
> From: Philippe Troin [mailto:[email protected]]
> Sent: Tue 09-Dec-03 10:46
> To: Trond Myklebust
> Cc: Kenny Simpson; [email protected]; [email protected]
> Subject: [NFS] Re: [NFS client] NFS locks not released on abnormal process termination
> Trond Myklebust <[email protected]> writes:
>
> > >>>>> " " == Philippe Troin <[email protected]> writes:
> >
> > > From my reading of the patch, it supersedes the old patch, and
> > > is only
> > > necessary on the client. Is also does not compile :-)
> >
> > Yeah, I admit I didn't test it out...
> >
> > > Here's an updated patch which does compile.
> >
> > Thanks.
> >
> > > I am still running tests, but so far it looks good (that is all
> > > locks are freed when a process with locks running on a NFS
> > > client is killed).
> >
> > Good...
>
> I've ran test overnight on four boxen, and no locks were lost.
> I guess you can send this patch to Marcello now.
>
> I've tested with the enclosed program.
>
>
> > There are still 2 other issues with the generic POSIX locking code.
> > Both issues have to do with CLONE_VM and have been raised on
> > linux-kernel & linux-fsdevel. Unfortunately they met with no response,
> > so I'm unable to pursue...
>
> Can we help? Pointers?
>
> Phil.
>
>
>
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by: SF.net Giveback Program.
> Does SourceForge.net help you be more productive? Does it
> help you create better code? SHARE THE LOVE, and help us help
> YOU! Click Here: http://sourceforge.net/donate/
> _______________________________________________
> NFS maillist - [email protected]
> https://lists.sourceforge.net/lists/listinfo/nfs


-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
Does SourceForge.net help you be more productive? Does it
help you create better code? SHARE THE LOVE, and help us help
YOU! Click Here: http://sourceforge.net/donate/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2003-12-09 20:18:24

by Chris Croswhite

[permalink] [raw]
Subject: RE: Re: [NFS client] NFS locks not released on abnormal process termination


Thanks!

-----Original Message-----
From: Philippe Troin [mailto:[email protected]]
Sent: Tue 09-Dec-03 11:26
To: Chris Croswhite
Cc: [email protected]; [email protected]
Subject: Re: [NFS] Re: [NFS client] NFS locks not released on abnormal =
process termination
"Chris Croswhite" <[email protected]> writes:

> Philippe,
>=20
> What patches are you refering to?

The one in <[email protected]> named
linux-2.4.23-nfs-lock-race-2.patch=20

Here a link to MARC, since the sourceforge mailing list web page
sucks:

http://marc.theaimsgroup.com/?l=3Dlinux-nfs&m=3D107095817723325&w=3D2

Phil.

> -----Original Message-----
> From: Philippe Troin [mailto:[email protected]]
> Sent: Tue 09-Dec-03 10:46
> To: Trond Myklebust
> Cc: Kenny Simpson; [email protected]; =
[email protected]
> Subject: [NFS] Re: [NFS client] NFS locks not released on abnormal =
process termination
> Trond Myklebust <[email protected]> writes:
>=20
> > >>>>> " " =3D=3D Philippe Troin <[email protected]> writes:
> >=20
> > > From my reading of the patch, it supersedes the old patch, =
and
> > > is only
> > > necessary on the client. Is also does not compile :-)
> >=20
> > Yeah, I admit I didn't test it out...
> >=20
> > > Here's an updated patch which does compile.
> >=20
> > Thanks.
> >=20
> > > I am still running tests, but so far it looks good (that is =
all
> > > locks are freed when a process with locks running on a NFS
> > > client is killed).
> >=20
> > Good...
>=20
> I've ran test overnight on four boxen, and no locks were lost.
> I guess you can send this patch to Marcello now.
>=20
> I've tested with the enclosed program.
>=20
> =20
> > There are still 2 other issues with the generic POSIX locking code.
> > Both issues have to do with CLONE_VM and have been raised on
> > linux-kernel & linux-fsdevel. Unfortunately they met with no =
response,
> > so I'm unable to pursue...
>=20
> Can we help? Pointers?
>=20
> Phil.
>=20
>=20
>=20
>=20
>=20
> -------------------------------------------------------
> This SF.net email is sponsored by: SF.net Giveback Program.
> Does SourceForge.net help you be more productive? Does it
> help you create better code? SHARE THE LOVE, and help us help
> YOU! Click Here: http://sourceforge.net/donate/
> _______________________________________________
> NFS maillist - [email protected]
> https://lists.sourceforge.net/lists/listinfo/nfs





-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
Does SourceForge.net help you be more productive? Does it
help you create better code? SHARE THE LOVE, and help us help
YOU! Click Here: http://sourceforge.net/donate/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2003-12-10 02:42:58

by Kenny Simpson

[permalink] [raw]
Subject: Re: [NFS client] NFS locks not released on abnormal process termination

> I've ran test overnight on four boxen, and no locks were lost.
> I guess you can send this patch to Marcello now.
>
Excellent work!

> > There are still 2 other issues with the generic POSIX locking code.
> > Both issues have to do with CLONE_VM and have been raised on
> > linux-kernel & linux-fsdevel. Unfortunately they met with no response,
> > so I'm unable to pursue...
>
> Can we help? Pointers?
Let me know if/how I can help.

Again, great work.

thanks,
-Kenny

__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/

2003-12-15 01:04:21

by Kenny Simpson

[permalink] [raw]
Subject: Re: [NFS client] NFS locks not released on abnormal process termination

Any hope of getting this patch into 2.6?

-Kenny


__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/

2003-12-15 01:14:29

by Trond Myklebust

[permalink] [raw]
Subject: Re: [NFS client] NFS locks not released on abnormal process termination

>>>>> " " == Kenny Simpson <[email protected]> writes:

> Any hope of getting this patch into 2.6? -Kenny

2.6 is in full freeze at the moment. Later perhaps...

Cheers,
Trond




-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
Does SourceForge.net help you be more productive? Does it
help you create better code? SHARE THE LOVE, and help us help
YOU! Click Here: http://sourceforge.net/donate/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2004-01-08 10:47:44

by YAMAMOTO Takashi

[permalink] [raw]
Subject: Re: [NFS client] NFS locks not released on abnormal process termination

hi,

> + status = nlmclnt_proc(inode, cmd, fl);
> + /* If we were signalled we still need to ensure that
> + * we clean up any state on the server. We therefore
> + * record the lock call as having succeeded in order to
> + * ensure that locks_remove_posix() cleans it out when
> + * the process exits.
> + */
> + if (status == -EINTR || status == -ERESTARTSYS)
> + posix_lock_file(filp, fl, 0);
> + unlock_kernel();
> + if (status < 0)
> + return status;

i think it's problematic because you can't assume the lock was
granted on the server and the signaled process might not exit immediately.

YAMAMOTO Takashi



-------------------------------------------------------
This SF.net email is sponsored by: Perforce Software.
Perforce is the Fast Software Configuration Management System offering
advanced branching capabilities and atomic changes on 50+ platforms.
Free Eval! http://www.perforce.com/perforce/loadprog.html
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2004-01-08 16:51:12

by Trond Myklebust

[permalink] [raw]
Subject: Re: [NFS client] NFS locks not released on abnormal process termination

>
> i think it's problematic because you can't assume the lock was
> granted on the server and the signaled process might not exit
> immediately.

The point is that it is *worse* to assume the lock was not granted,
since then it will never get cleared on the server.

The RPC layer blocks all signals except SIGKILL, so the signalled
process has no choice but to exit immediately if something gets
through.

Cheers,
Trond




-------------------------------------------------------
This SF.net email is sponsored by: Perforce Software.
Perforce is the Fast Software Configuration Management System offering
advanced branching capabilities and atomic changes on 50+ platforms.
Free Eval! http://www.perforce.com/perforce/loadprog.html
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2004-01-09 02:56:38

by YAMAMOTO Takashi

[permalink] [raw]
Subject: Re: Re: [NFS client] NFS locks not released on abnormal process termination

hi,

> > i think it's problematic because you can't assume the lock was
> > granted on the server and the signaled process might not exit
> > immediately.
>
> The point is that it is *worse* to assume the lock was not granted,
> since then it will never get cleared on the server.

yes.

> The RPC layer blocks all signals except SIGKILL, so the signalled
> process has no choice but to exit immediately if something gets
> through.

we're talking about interruptible mounts, aren't we?

are you referring to rpc_clnt_sigmask() ?
i think it isn't safe to assume sa_handler isn't changed during
blocking for lock. consider CLONE_SIGHAND, for example.

YAMAMOTO Takashi



-------------------------------------------------------
This SF.net email is sponsored by: Perforce Software.
Perforce is the Fast Software Configuration Management System offering
advanced branching capabilities and atomic changes on 50+ platforms.
Free Eval! http://www.perforce.com/perforce/loadprog.html
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2004-01-09 03:40:55

by Trond Myklebust

[permalink] [raw]
Subject: Re: [NFS] Re: [NFS client] NFS locks not released on abnormal process termination

>> The RPC layer blocks all signals except SIGKILL, so the signalled
>> process has no choice but to exit immediately if something gets
>> through.
>
> we're talking about interruptible mounts, aren't we?
>
> are you referring to rpc_clnt_sigmask() ?
> i think it isn't safe to assume sa_handler isn't changed during
> blocking for lock. consider CLONE_SIGHAND, for example.

So what? If you decide handle a signal, then you are taking full
responsibility for the recovery process. It is up to _you_ to take action
to either recover the lock or to undo it, not the kernel. To determine
whether or not the lock was taken on the server you can just do a
fcntl(GETLK) call.

All the kernel cares about is that when the process exits, it needs to
clean up all the locks that are owned by that pid.

Cheers,
Trond