Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id ; Thu, 22 Mar 2001 12:14:37 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id ; Thu, 22 Mar 2001 12:14:28 -0500 Received: from enhanced.ppp.eticomm.net ([206.228.183.5]:40431 "EHLO intech19.enhanced.com") by vger.kernel.org with ESMTP id ; Thu, 22 Mar 2001 12:14:18 -0500 From: Camm Maguire To: linux-kernel@vger.kernel.org cc: nfs-devel@linux.kernel.org Subject: PROBLEM: 2.2.18 oops leaves umount hung in disk sleep User-Agent: WEMI/1.13.7 (Shimada) FLIM/1.13.2 (Kasanui) Emacs/20.7 (i386-debian-linux-gnu) MULE/4.0 (HANANOEN) MIME-Version: 1.0 (generated by WEMI 1.13.7 - "Shimada") Content-Type: text/plain; charset=US-ASCII Message-Id: Date: Thu, 22 Mar 2001 12:13:29 -0500 Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Please reply directly as I'm in ECN exile from this mailing list! [1.] One line summary of the problem: 2.2.18 oops leaves umount hung in disk sleep [2.] Full description of the problem/report: Greetings! We have a backup server running 2.2.18, nfs-kernel-server 0.1.9.1-1 (Debian), mount 2.10f-5.1, autofs 3.1.4-9. Two nights ago, the kernel had an oops when autofs tried to umount an nfs-mounted filesystem, leaving the umount process in an uninterruptible state. 15655 ? D 0:00 /bin/umount /mnt/i19 [3.] Keywords (i.e., modules, networking, kernel): nfs, autofs, mount [4.] Kernel version (from /proc/version): intech9# cat /proc/version Linux version 2.2.18-i586tsc (root@intech19) (gcc version 2.95.2 20000220 (Debian GNU/Linux)) #1 Tue Feb 13 14:31:34 EST 2001 [5.] Output of Oops.. message (if applicable) with symbolic information resolved (see Documentation/oops-tracing.txt) intech9# cat oo.txt Unable to handle kernel NULL pointer dereference at virtual address 00000000 current->tss.cr3 = 02872000, %%cr3 = 02872000 *pde = 00000000 Oops: 0000 CPU: 0 intech9# ksymoops tss.cr3 = 02872000, %%cr3 = 02872000 *pde = 00000000 Oops: 0000 CPU: 0 2 warnings issued. Results may not be reliable. [6.] A small shell script or example program which triggers the problem (if possible) NA [7.] Environment [7.1.] Software (add the output of the ver_linux script here) intech9# sh ./ver_linux sh ./ver_linux -- Versions installed: (if some fields are empty or looks -- unusual then possibly you have very old versions) Linux intech9 2.2.18-i586tsc #1 Tue Feb 13 14:31:34 EST 2001 i586 unknown Kernel modules 2.4.1 Gnu C 2.95.2 Binutils 2.9.5.0.37 Linux C Library 2.1.3 Dynamic linker ldd: version 1.9.11 Procps 2.0.6 Mount 2.10f Net-tools 2.05 Console-tools 0.2.3 Sh-utils 2.0 Modules Loaded nls_cp437 smbfs st ide-scsi scsi_mod nfsd parport_probe parport_pc lp parport nfs autofs lockd sunrpc ne2k-pci 8390 serial ide-disk ide-probe ide-mod [7.2.] Processor information (from /proc/cpuinfo): intech9# cat /proc/cpuinfo cat /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family : 5 model : 8 model name : AMD-K6(tm) 3D processor stepping : 12 cpu MHz : 451.019 cache size : 64 KB fdiv_bug : no hlt_bug : no sep_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr mce cx8 sep mtrr pge mmx 3dnow bogomips : 901.12 [7.3.] Module information (from /proc/modules): intech9# cat /proc/modules cat /proc/modules nls_cp437 3920 0 (autoclean) smbfs 26144 0 (autoclean) st 24160 0 (autoclean) ide-scsi 7424 0 (autoclean) scsi_mod 38160 2 (autoclean) [st ide-scsi] nfsd 161952 8 (autoclean) parport_probe 3408 0 (autoclean) parport_pc 7408 1 (autoclean) lp 5296 0 (autoclean) (unused) parport 7584 1 (autoclean) [parport_probe parport_pc lp] nfs 44480 0 (autoclean) autofs 9136 4 (autoclean) lockd 43312 1 (autoclean) [nfsd nfs] sunrpc 58768 1 (autoclean) [nfsd nfs lockd] ne2k-pci 4176 1 (autoclean) 8390 6144 0 (autoclean) [ne2k-pci] serial 18016 0 (autoclean) ide-disk 5736 8 (autoclean) ide-probe 6116 0 (autoclean) ide-mod 43384 8 (autoclean) [ide-scsi ide-disk ide-probe] [7.4.] SCSI information (from /proc/scsi/scsi) intech9# cat /proc/scsi/scsi cat /proc/scsi/scsi Attached devices: Host: scsi0 Channel: 00 Id: 00 Lun: 00 Vendor: CONNER Model: CTT8000-A Rev: 1.17 Type: Sequential-Access ANSI SCSI revision: 02 Host: scsi0 Channel: 00 Id: 01 Lun: 00 Vendor: HP Model: COLORADO 14GB Rev: 4.00 Type: Sequential-Access ANSI SCSI revision: 02 intech9# [7.5.] Other information that might be relevant to the problem (please look in /proc and include all information that you think to be relevant): intech9# cat /proc/15655/status cat /proc/15655/status Name: umount State: D (disk sleep) Pid: 15655 PPid: 305 Uid: 0 0 0 0 Gid: 0 0 0 0 Groups: VmSize: 1024 kB VmLck: 0 kB VmRSS: 408 kB VmData: 32 kB VmStk: 8 kB VmExe: 36 kB VmLib: 924 kB SigPnd: 0000000000000000 SigBlk: 0000000000000000 SigIgn: 8000000000000000 SigCgt: 0000000000000000 CapInh: 0000000000000000 CapPrm: 00000000fffffeff CapEff: 00000000fffffeff Here are the syslog excepts: Mar 21 01:09:01 intech9 automount[15648]: running expiration on path /mnt Mar 21 01:09:01 intech9 automount[15648]: expired /mnt/local Mar 21 01:09:01 intech9 automount[15648]: expired /mnt/i19d Mar 21 01:09:47 intech9 automount[305]: >> umount: /mnt/i19: device is busy Mar 21 01:09:47 intech9 automount[305]: using kernel protocol version 3 on reawaken Mar 21 01:11:03 intech9 automount[305]: >> umount: /mnt/i19: device is busy Mar 21 01:11:03 intech9 automount[305]: using kernel protocol version 3 on reawaken Mar 21 01:12:18 intech9 automount[305]: >> umount: /mnt/i19: device is busy Mar 21 01:12:18 intech9 automount[305]: using kernel protocol version 3 on reawaken Mar 21 01:13:33 intech9 automount[305]: >> umount: /mnt/i19: device is busy Mar 21 01:13:33 intech9 automount[305]: using kernel protocol version 3 on reawaken Mar 21 01:14:01 intech9 automount[15653]: running expiration on path /mnt Mar 21 01:14:01 intech9 automount[15653]: expired /mnt/local Mar 21 01:14:01 intech9 automount[15653]: expired /mnt/i19d Mar 21 01:14:49 intech9 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000 Mar 21 01:14:49 intech9 kernel: current->tss.cr3 = 02872000, %%cr3 = 02872000 Mar 21 01:14:49 intech9 kernel: *pde = 00000000 Mar 21 01:14:49 intech9 kernel: Oops: 0000 Mar 21 01:14:49 intech9 kernel: CPU: 0 Mar 21 01:14:49 intech9 automount[305]: using kernel protocol version 3 on reawaken The nfs host i19 does not show the client as having an active mount at present. [X.] Other notes, patches, fixes, workarounds: Andrea's vmglobal patch applied to kernel. PIII SSE enabling patch also applied. -- Camm Maguire camm@enhanced.com ========================================================================== "The earth is but one country, and mankind its citizens." -- Baha'u'llah - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/