2002-04-02 16:50:30

by Brian Tinsley

Subject: EJUKEBOX and Java

While I'm starting to investigate, I just wanted to post my quandary here:


We have a Linux-based Java application (2.4.17 kernel, IBM JVM) that
accesses files over an NFS v3 mount (UDP) to a Solaris 8 NFS server that
exports a SAM-FS filesystem (Sun's HSM product). It seems that whenever
our application requests access to a file that resides on tape, it
encounters a temporary deadlock condition. We know that the NFS server
is returning EJUKEBOX at this point and it seems that once data begins
to flow back to the client, the deadlock releases. Could this be in any
way due to changing signal handling in the nfs3_rpc_wrapper function?
The Java virtual machine does use signals for internal purposes; I
believe libpthread does so as well. Any thoughts on this are more than
welcome.

--
Brian Tinsley
Senior Systems Engineer
Emageon




_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2002-04-03 00:42:13

by Brian Tinsley

Subject: Re: EJUKEBOX and Java

Understood, but when the EJUKEBOX error is encountered by our
application, the fact that RPC blocks signals (via rpc_clnt_sigmask)
during this sleep seems to cause every thread in the Java VM to
"deadlock" (we can even see the garbage collector stop) until data
starts to stream back from the NFS server. This is our current working
theory anyway.
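
A rough user-space analogy for the theory above, offered only as a sketch: a thread that blocks a signal while it sleeps cannot react to that signal until the mask is restored. This is not the kernel's rpc_clnt_sigmask code. The function name, the choice of SIGUSR1, and the nap length are all assumptions made for illustration, and the example needs a POSIX system and must run in the main thread.

```python
import signal
import threading
import time


def sleep_with_signal_blocked(signum=signal.SIGUSR1, nap=0.1):
    """Block signum, send it to this thread, sleep, then restore the mask.

    Returns (handled_during_sleep, pending_during_sleep, handled_after_unblock).
    Hypothetical helper; call it from the main thread only.
    """
    seen = []
    old_handler = signal.signal(signum, lambda s, frame: seen.append(s))
    # Block the signal in this thread, loosely mirroring a masked sleep.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        # Some other party "asks" this thread to react by sending a signal.
        signal.pthread_kill(threading.get_ident(), signum)
        time.sleep(nap)  # stand-in for the wait before a retry
        handled_during_sleep = bool(seen)
        pending_during_sleep = signum in signal.sigpending()
    finally:
        # Restoring the old mask lets the pending signal through.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    handled_after_unblock = bool(seen)
    if old_handler is not None:
        signal.signal(signum, old_handler)
    return handled_during_sleep, pending_during_sleep, handled_after_unblock


if __name__ == "__main__":
    print(sleep_with_signal_blocked())
```

If anything in a runtime waits for such a thread to acknowledge a signal, that wait would last as long as the masked sleep does, which is the shape of the stall described above.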


Kent, Ian I. wrote:

>The NFSv3 RFC specifies the client behaviour you are seeing for this server return.
>
>It's not a deadlock, the client sleeps, then retries.
>
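
For readers unfamiliar with the behaviour Ian describes, here is a minimal sketch of a sleep-then-retry loop. RFC 1813 assigns NFS3ERR_JUKEBOX the value 10008 and says the client should wait and then retry the request. The function name, the do_call callback, and the 5-second default delay are assumptions made for illustration; this is not the Linux client's code.

```python
import time

NFS3_OK = 0
NFS3ERR_JUKEBOX = 10008  # status value assigned by RFC 1813


def call_with_jukebox_retry(do_call, retry_delay=5.0, sleep=time.sleep):
    """Invoke do_call() until it returns something other than NFS3ERR_JUKEBOX.

    do_call is a hypothetical stand-in for issuing one RPC and returning
    its status. Returns (final_status, number_of_attempts).
    """
    attempts = 0
    while True:
        attempts += 1
        status = do_call()
        if status != NFS3ERR_JUKEBOX:
            return status, attempts
        # The server is still staging the file (for example, from tape):
        # wait a while, then issue the request again.
        sleep(retry_delay)


if __name__ == "__main__":
    # Fake server: answers "jukebox" twice, then succeeds.
    replies = iter([NFS3ERR_JUKEBOX, NFS3ERR_JUKEBOX, NFS3_OK])
    print(call_with_jukebox_retry(lambda: next(replies), sleep=lambda s: None))
```

From the caller's point of view the call simply takes a long time to return, which is why it can look like a hang rather than an error.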




2002-04-03 21:40:04

by Lever, Charles

Subject: RE: EJUKEBOX and Java

this could appear because your Java threads are emulated
entirely in user space (green threads?). if one of the
threads goes into the kernel via a system call and doesn't
return, then your entire JVM is hung.

just a thought.

> -----Original Message-----
> From: Brian Tinsley [mailto:[email protected]]
> Sent: Tuesday, April 02, 2002 7:39 PM
> To: Kent, Ian I.
> Cc: [email protected]
> Subject: Re: [NFS] EJUKEBOX and Java
>
>
> Understood, but when the EJUKEBOX error is encountered by our
> application, the fact that RPC blocks signals (via rpc_clnt_sigmask)
> during this sleep seems to cause every thread in the Java VM to
> "deadlock" (we can even see the garbage collector stop) until data
> starts to stream back from the NFS server. This is our current working
> theory anyway.
>
>
> Kent, Ian I. wrote:
>
> >The NFSv3 RFC specifies the client behaviour you are seeing for this server return.
> >
> >It's not a deadlock, the client sleeps, then retries.
> >

2002-04-03 21:50:16

by Brian Tinsley

Subject: Re: EJUKEBOX and Java

Thanks, but that's not the case. The IBM JVM uses native threads as it
should. We have recently discovered, however, that they apparently have
signal handling problems that are leading to this deadlock. Our
developers were able to write a small app that consistently reproduces
the deadlock on both their 1.3.0 and 1.3.1 releases. We have not been
able to reproduce the problem with Sun's JVM, thus the finger points to
the IBM JVM.


Lever, Charles wrote:

>this could appear because your Java threads are emulated
>entirely in user space (green threads?). if one of the
>threads goes into the kernel via a system call and doesn't
>return, then your entire JVM is hung.
>
>just a thought.
>
>>-----Original Message-----
>>From: Brian Tinsley [mailto:[email protected]]
>>Sent: Tuesday, April 02, 2002 7:39 PM
>>To: Kent, Ian I.
>>Cc: [email protected]
>>Subject: Re: [NFS] EJUKEBOX and Java
>>
>>
>>Understood, but when the EJUKEBOX error is encountered by our
>>application, the fact that RPC blocks signals (via rpc_clnt_sigmask)
>>during this sleep seems to cause every thread in the Java VM to
>>"deadlock" (we can even see the garbage collector stop) until data
>>starts to stream back from the NFS server. This is our current working
>>theory anyway.
>>
>>
>>Kent, Ian I. wrote:
>>
>>>The NFSv3 RFC specifies the client behaviour you are seeing for this server return.
>>>
>>>It's not a deadlock, the client sleeps, then retries.
>>>

--
Brian Tinsley
Senior Systems Engineer
Emageon




