2003-12-06 19:50:58

by Philippe Troin

Subject: Re: [NFS client] NFS locks not released on abnormal process termination

Kenny Simpson <[email protected]> writes:

> I have an up-to-date RH9 (kernel 2.4.20-24.9, nfs utils 1.0.1-3.9) I'm using as
> an NFS client (the server is a NetApp), and I'm trying to use advisory locking
> of a file.
> If the process that locks the file exits normally, the lock is released, and
> everything is fine.
> However, if the process aborts, the lock is left with no clear way to remove
> it. I must remove the file to get rid of the lock.
>
> Details:
> Here is a test case:
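> #include <fcntl.h>
> #include <stdio.h>
> #include <unistd.h>
>
> int main(void)
> {
>     int fd = open("file", O_RDWR);
>     if (fd < 0) { perror("open"); return 1; }
>     if (lockf(fd, F_TLOCK, 0) < 0)
>         perror("lockf");  /* .... print error message, query owner */
>     pause();              /* hold the lock until a signal arrives */
>     close(fd);
>     return 0;
> }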
>
> If I run this, when it gets to the pause(), I can clearly see in /proc/locks
> the process owning the lock.
> If I then kill -ABRT <pid>, the entry in /proc/locks goes away, but the lock is
> not removed from the server.
> When I run the program a second time, the lock acquire fails, and it says the
> (now defunct) old process still owns the lock. Since I cannot easily make
> another process with the id of the original, I seem to have no way to
> explicitly release the lock.

I've also seen this behavior with the stock 2.4.22 and 2.4.23
kernels.

See the thread:

http://sourceforge.net/mailarchive/forum.php?thread_id=3213117&forum_id=4930


> I even ran ethereal to watch which NLM requests were being made. No unlock
> request was ever sent, so I don't think this can be a server issue.
>
> Any ideas? Is it supposed to work this way?

No and no.

Phil.
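
The "query owner" step in the test case can be done with fcntl(F_GETLK).
Below is a minimal sketch, not the original poster's code: lock_owner()
is a hypothetical helper name, and it assumes a Linux/glibc client, where
lockf() is built on fcntl() record locks so the two interfaces see each
other's locks. On an NFS mount the reported PID is whatever the client
and server hand back, so treat it as a hint.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask who holds a lock that would conflict with a
 * whole-file write lock on fd.
 * Returns the owner's PID, 0 if no conflicting lock exists, -1 on error.
 * Locks held by the calling process itself never count as conflicts.
 */
pid_t lock_owner(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;     /* the lock we would like to place */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "to end of file" */

    if (fcntl(fd, F_GETLK, &fl) < 0)
        return -1;

    /* F_GETLK rewrites l_type to F_UNLCK when nothing conflicts. */
    return fl.l_type == F_UNLCK ? 0 : fl.l_pid;
}
```

Calling lock_owner(fd) right after a failed lockf(fd, F_TLOCK, 0) gives
the PID to compare against /proc/locks and the process table.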



2003-12-08 03:39:33

by Kenny Simpson

Subject: Re: [NFS client] NFS locks not released on abnormal process termination

--- Philippe Troin <[email protected]> wrote:
>
> I've also seen this behavior with the stock 2.4.22 and 2.4.23
> kernels.
>
> See the thread:
>
> http://sourceforge.net/mailarchive/forum.php?thread_id=3213117&forum_id=4930
>

So, this patch has not found its way into any kernel yet?
Is there anyone actively pursuing this bug?

-Kenny



2003-12-08 05:16:20

by Trond Myklebust

Subject: Re: [NFS client] NFS locks not released on abnormal process termination

>>>>> " " == Kenny Simpson <[email protected]> writes:

> So, this patch has not found its way into any kernel yet? Is
> there anyone actively pursuing this bug?

Feel free. There are only so many hours in a day, and right now
mine are pretty much overbooked with NFSv4 stuff...

Cheers,
Trond

2003-12-08 17:32:05

by Philippe Troin

Subject: Re: [NFS client] NFS locks not released on abnormal process termination

Trond Myklebust <[email protected]> writes:

> >>>>> " " == Kenny Simpson <[email protected]> writes:
>
> > So, this patch has not found its way into any kernel yet? Is
> > there anyone actively pursuing this bug?
>
> Feel free. There are only so many hours in a day, and right now
> mine are pretty much overbooked with NFSv4 stuff...

Please note that this fix only mitigates the bug: it can still occur,
but much less frequently. Before this patch, nfsd would lose track of
the lock (see the enclosed program at the beginning of the thread)
after a few (<5) kills. With the patch, it sometimes takes as many as
300~500 kills before the bug manifests itself.

Trond, do you think I should push the patch to Marcelo, or should I
wait for a better fix? I don't think Marcelo would accept a partial
fix. I would try to fix it myself, but I have no clue on the inner
workings of lockd/nfsd.

Phil.
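
Counting "kills before the bug shows up" can be automated. Below is a
rough sketch of such a harness, not the program enclosed earlier in the
thread: count_stale_locks() is a hypothetical name, and the pipe
handshake is an assumption added so that the parent only kills the child
after the lock attempt has finished. Note that this handshake
deliberately avoids killing the child while the lock request is still in
flight, so it exercises only the "lock held, then killed" case.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical harness: for each round, fork a child that locks `path`
 * and sleeps, kill it with `sig` (which must be fatal, e.g. SIGABRT or
 * SIGKILL), then check from the parent whether the lock can be taken.
 * Returns the number of rounds in which the lock was still held after
 * the child died, or -1 on a setup error.
 */
int count_stale_locks(const char *path, int rounds, int sig)
{
    int stale = 0;

    for (int i = 0; i < rounds; i++) {
        int ready[2];
        if (pipe(ready) < 0)
            return -1;

        pid_t pid = fork();
        if (pid < 0) {
            close(ready[0]);
            close(ready[1]);
            return -1;
        }
        if (pid == 0) {
            /* Child: try the lock, report the outcome, wait to be killed. */
            close(ready[0]);
            int cfd = open(path, O_RDWR);
            char ok = (cfd >= 0 && lockf(cfd, F_TLOCK, 0) == 0) ? 1 : 0;
            if (write(ready[1], &ok, 1) != 1 || !ok)
                _exit(1);
            for (;;)
                pause();
        }

        /* Parent: wait until the child has finished its lock attempt. */
        close(ready[1]);
        char ok = 0;
        (void)read(ready[0], &ok, 1);
        close(ready[0]);

        kill(pid, sig);
        waitpid(pid, NULL, 0);

        /* The child is gone, so the lock should be free again. */
        int fd = open(path, O_RDWR);
        if (fd < 0)
            return -1;
        if (lockf(fd, F_TLOCK, 0) < 0 &&
            (errno == EACCES || errno == EAGAIN))
            stale++;
        close(fd);   /* closing drops any lock the parent just took */
    }
    return stale;
}
```

On a local filesystem count_stale_locks("file", 500, SIGABRT) should
return 0; a nonzero result on an NFS mount would indicate that a lock
outlived its owner.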

2003-12-08 19:56:07

by Trond Myklebust

Subject: Re: [NFS client] NFS locks not released on abnormal process termination

On Mon, 08/12/2003 at 12:32, Philippe Troin wrote:
> Trond, do you think I should push the patch to Marcelo, or should I
> wait for a better fix?

No. If I wanted a partial fix, I could just as well have pushed it to
Marcelo myself. When I said "feel free" I was referring to pursuing the
remaining signalling bugs.

I have a feeling the second race case of your test is that you are
interrupting the fcntl(F_SETLK) call while it is on the wire. If you do
that, then the server may record the lock as taken, but the client will
never receive the reply, and so will not know that it must clean up
locks...
Hmm... For that case, we probably want to have the locking code record
the call as having succeeded in order to ensure that we do indeed clear
out the lock on process exit. See if the appended patch helps...

Cheers,
Trond


Attachments:
linux-2.4.23-lock_race.dif (4.12 kB)
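
The interrupted-request race described above can be illustrated with a
toy state model. This is a user-space sketch only, not the attached
patch and not kernel code: every name in it (lock_model, request_lock,
lock_leaks, and so on) is hypothetical. It assumes the worst case for a
lost reply, namely that the server granted the lock before the client
was interrupted.

```c
#include <stdbool.h>

/* Possible outcomes of a lock request, as seen from the client. */
enum reply { REPLY_GRANTED, REPLY_DENIED, REPLY_LOST };

struct lock_model {
    bool server_has_lock;   /* what the server believes */
    bool client_has_lock;   /* what the client has recorded */
};

/*
 * Model one lock request. With assume_granted_on_loss set, a request
 * whose reply never arrived is recorded as if it had succeeded.
 */
static void request_lock(struct lock_model *m, enum reply r,
                         bool assume_granted_on_loss)
{
    switch (r) {
    case REPLY_GRANTED:
        m->server_has_lock = true;
        m->client_has_lock = true;
        break;
    case REPLY_DENIED:
        break;
    case REPLY_LOST:
        /* Worst case: the server granted it, the client never heard. */
        m->server_has_lock = true;
        m->client_has_lock = assume_granted_on_loss;
        break;
    }
}

/* On exit the client only unlocks what it has on record. */
static void process_exit(struct lock_model *m)
{
    if (m->client_has_lock) {
        m->server_has_lock = false;   /* unlock request goes out */
        m->client_has_lock = false;
    }
}

/* Returns true if the server still holds the lock after process exit. */
bool lock_leaks(enum reply r, bool assume_granted_on_loss)
{
    struct lock_model m = { false, false };

    request_lock(&m, r, assume_granted_on_loss);
    process_exit(&m);
    return m.server_has_lock;
}
```

In this model lock_leaks(REPLY_LOST, false) is true while
lock_leaks(REPLY_LOST, true) is false. The cost of the conservative
choice is an occasional unlock request for a lock the server never
actually granted.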