From: "Yogesh Pahilwan" Subject: Oops in rpciod when building procmail from an NFS client Date: Wed, 14 Nov 2007 11:59:15 +0530 Message-ID: <83FE585A3944624683D9A3F69BA108EB0224B633@PUN2K3EXCL01.symphonysv.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1069728705==" To: Return-path: Received: from sc8-sf-mx2-b.sourceforge.net ([10.3.1.92] helo=mail.sourceforge.net) by sc8-sf-list2-new.sourceforge.net with esmtp (Exim 4.43) id 1IsBkk-0003Ba-HO for nfs@lists.sourceforge.net; Tue, 13 Nov 2007 22:29:34 -0800 Received: from pun2k3exfe01.symphonysv.com ([220.225.41.101]) by mail.sourceforge.net with esmtp (Exim 4.44) id 1IsBko-0007Lz-JG for nfs@lists.sourceforge.net; Tue, 13 Nov 2007 22:29:40 -0800 List-Id: "Discussion of NFS under Linux development, interoperability, and testing." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: nfs-bounces@lists.sourceforge.net Errors-To: nfs-bounces@lists.sourceforge.net This is a multi-part message in MIME format. --===============1069728705== Content-class: urn:content-classes:message Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C82687.AD7E8D0C" This is a multi-part message in MIME format. ------_=_NextPart_001_01C82687.AD7E8D0C Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: quoted-printable Hi Folks, =20 I am getting following oops while building procmail utility from the NFS mounted share when build from the NFS client. The NFS server oopses in rpciod during the "Benchmarking your system's strstr () implementation" step. =20 I am using the following configurations for NFS client and server: NFS client configuration: kernel version - 2.6.11-1.1369_FC4 NFS server configuration: kernel version - 2.4.19 Procmail version: v3.22 =20 When I encountered the following oops I didn't kill the build process on the client, and when I rebooted the server the build progressed a little bit beyond where it was when the server crashed, and then the NFS server oopsed again. I tried to kill the process again on the client (via ctrl-c, ctrl-z), since it was unresponsive due to running off a mount to the server while the server was unavailable; so instead I just killed the console in the hopes that would kill the process tree including the one stuck on the server mount. I rebooted the server again, and within a minute or so it crashed yet again, seemingly spontaneously; and this occurred again after another reboot. I checked the processes on the client and found several stuck "_locktst /tmp/_locktest ./_locktest" processes. These processes were in clusters of about 6 processes with contiguous PIDs, and there were about the same number of these clusters as the number of times I'd run the procmail make and oopsed the server. Once I killed all these _locktst processes and rebooted the server again, the oopsing stopped. 
The oops:

Oops: 0000
CPU:    0
EIP:    0010:[<40313474>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000000   ebx: 4e38dd20   ecx: 00000000   edx: 00000004
esi: 4e38c000   edi: 4e38dd20   ebp: 00000001   esp: 4fc41f20
ds: 0018   es: 0018   ss: 0018
Process rpciod (pid: 28706, stackpage=4fc41000)
Stack: 00000216 5408ac94 4e38c000 5a8a4720 4e38c000 5a8a4720 4031552d 4e38c000
       4e38dd20 40313967 4e38c000 4e38dd20 40313c70 00000000 00000000 4e38dd74
       4e38dd20 4fc40000 00000001 403165ca 4e38dd20 403aa000 4fc40000 410e09a0
Call Trace:    [<4031552d>] [<40313967>] [<40313c70>] [<403165ca>] [<40113dac>]
   [<403168cd>] [<403170b1>] [<40317010>] [<4010757e>] [<40317010>]
Code: 8b 80 88 00 00 00 85 c0 74 39 c7 44 24 0c 00 00 00 00 8d 46
Error (pclose_local): Oops_decode pclose failed 0x7f00
Error (Oops_decode): no objdump lines read for /tmp/ksymoops.UQoI1x

>>EIP; 40313474 <__xprt_lock_write+54/f0>   <=====

>>ebx; 4e38dd20 <_end+df5ef5c/203d329c>
>>esi; 4e38c000 <_end+df5d23c/203d329c>
>>edi; 4e38dd20 <_end+df5ef5c/203d329c>
>>esp; 4fc41f20 <_end+f81315c/203d329c>

Trace; 4031552d <xprt_lock_write+1d/50>
Trace; 40313967 <xprt_reconnect+a7/3b0>
Trace; 40313c70 <xprt_reconn_status+0/80>
Trace; 403165ca <__rpc_execute+10a/2f0>
Trace; 40113dac <schedule+20c/350>
Trace; 403168cd <__rpc_schedule+8d/120>
Trace; 403170b1 <rpciod+a1/220>
Trace; 40317010 <rpciod+0/220>
Trace; 4010757e <kernel_thread+2e/40>
Trace; 40317010 <rpciod+0/220>

There is also a kernel panic that follows this.

Is there any patch available that fixes this NFS issue? I would appreciate it if anyone could tell me where to get such a patch (if one is available).

Regards,
Yogesh
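
P.S. The ksymoops errors above ("no objdump lines read") mean the Code: bytes were not disassembled. If a fuller decode would help, ksymoops can be re-run with the server's System.map and vmlinux given explicitly, along these lines (the paths and the oops file name are only guesses for a 2.4.19 build; substitute the real ones):

    # Decode against the map and image of the kernel that oopsed.
    ksymoops -t elf32-i386 -a i386 \
             -m /boot/System.map-2.4.19 \
             -v /usr/src/linux-2.4.19/vmlinux \
             oops.txt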
