Date: Thu, 29 Aug 2002 21:26:31 +0300 (EEST)
From: Meelis Roos
To: linux-kernel@vger.kernel.org
Subject: Hangs in 2.4.19 and 2.4.20-pre5 (IDE-related?)

I have an old computer (from 1997), a K6/200 with the 430TX chipset. It
has served me well so far. It ran 2.4.18 OK since February, but 2.4.19
and 2.4.20-pre5 both hang frequently. 2.2.15 works well, but it appears
to be running in PIO mode. 2.4.18 ran in udma33 mode, but maybe the
computer is broken now.

The hangs appear when I'm doing disk-intensive work (compiling a kernel,
bk clone, bk co, ...). Sometimes disk I/O just hangs at some point and
everything depending on the disk gets into D state. Usually I was in X
and didn't see any details, but later I was at the console and got some
information.

vmstat 1 was running at one crash and reported 17M of swap used (64M
total) and 6M of free memory, so this is not the memory shortage that I
suspected at first.

I also got Sysrq-P and Sysrq-T info. Sysrq-P showed it looping at
addresses 00088XXX (or was it 08088XXX). System.map didn't have these
addresses :( Sysrq-T showed that many tasks were in D state, but EIP was
usually either 00000000 or 7FFFFFFF (also on some sleeping tasks). Is
this normal?

I installed 2.4.20-pre5 (after several iterations of compile+boot I got
it compiled since both of my kernels were too new). Still the same.

I do my heavy work on hdd, which is an old Seagate 2.5G (non-udma but
dma mode 2).
This disk is even older than the computer; it was a new model when
bought. It shows some disturbing pre-errors in SMART but no errors in
the kernel log.

The hang happened 2-3 times just after a 'sync' command run after some
big operation like bk clone or bk co. I tried both ext2 and ext3
filesystems on the data partition; root is ext2. The hangs happen in
both cases.

The HDD LED is (almost) always on during the hangs. After Sysrq-B the
computer sometimes just hangs with the HDD LED on and sometimes boots.
This symptom is old on this computer: it occasionally hangs on reboot
with the HDD LED on, both when rebooting from Linux and from the win95
that I had a long time ago.

When I press Sysrq-S during the hang, it gets to Syncing 03:03 (hda3, my
root partition) and hangs there. So maybe it's not the old Seagate disk
that is at fault. hda and hdb are almost identical 1.6G Quantums.

I'm compiling 2.4.18 now to see whether I can reproduce the problem with
it (my old 2.4.18 binary doesn't work any more since it falsely detects
that hda has an Acorn PowerTec partition table; something changed in the
data on hda. 2.4.19 without Acorn partition table support and 2.2.15
find the partitions OK).

smartctl reports some high error rates for hdd; hda and hdb are normal.
hdc is a cdrom used via ide-scsi.

smartctl, atapci, hdparm, lspci, dmesg output are below.
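
As an aside on reading the SMART attribute table that follows: a minimal Python sketch, assuming the usual SMART convention that an attribute counts as failed once its normalized value drops to or below its threshold. The rows are copied from the smartctl output in this message; the names ATTRS, margin and failed are made up for illustration. Raw values are vendor-specific, so the sketch deliberately leaves them out.

```python
# Rows copied from the smartctl table in this message:
# (id, name, value, worst, threshold). Raw values are omitted on purpose,
# since their meaning is vendor-specific.
ATTRS = [
    (1, "Raw Read Error Rate", 114, 99, 0),
    (3, "Spin Up Time", 97, 97, 0),
    (4, "Start Stop Count", 100, 100, 20),
    (5, "Reallocated Sector Ct", 100, 100, 36),
    (7, "Seek Error Rate", 65, 53, 30),
    (10, "Spin Retry Count", 100, 100, 97),
    (12, "Power Cycle Count", 100, 100, 20),
]


def margin(attr):
    """Headroom between the worst recorded normalized value and the threshold."""
    _id, _name, _value, worst, threshold = attr
    return worst - threshold


def failed(attrs):
    """Names of attributes whose current or worst value is at or below threshold."""
    return [a[1] for a in attrs if min(a[2], a[3]) <= a[4]]


if __name__ == "__main__":
    # Print the attributes with the least headroom first.
    for attr in sorted(ATTRS, key=margin):
        print(f"{attr[1]:<22} margin {margin(attr):3d}")
    print("failed:", failed(ATTRS) or "none")
```

Under that assumption none of the listed attributes has reached its threshold. Seek Error Rate has a margin of 23 (worst 053 against threshold 030), and Spin Retry Count sits 3 above its threshold of 097.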

Vendor Specific SMART Attributes with Thresholds:
Revision Number: 5
Attribute                    Flag     Value Worst Threshold Raw Value
(  1)Raw Read Error Rate     0x000a   114   099   000       72998123
(  3)Spin Up Time            0x0006   097   097   000       3
(  4)Start Stop Count        0x0013   100   100   020       107
(  5)Reallocated Sector Ct   0x0013   100   100   036       0
(  7)Seek Error Rate         0x000b   065   053   030       24096359
( 10)Spin Retry Count        0x0013   100   100   097       0
( 12)Power Cycle Count       0x0013   100   100   020       102

pcibus = 33333
00:07.1 vendor=8086 device=7111 class=0101 irq=0 base4=f001

----------PIIX BusMastering IDE Configuration---------------
Driver Version:                     1.3
South Bridge:                       28945
Revision:                           IDE 0x1
Highest DMA rate:                   UDMA33
BM-DMA base:                        0xf000
PCI clock:                          33.3MHz
-----------------------Primary IDE-------Secondary IDE------
Enabled:                      yes                 yes
Simplex only:                  no                  no
Cable Type:                   40w                 40w
-------------------drive0----drive1----drive2----drive3-----
Prefetch+Post:        yes       yes       yes       yes
Transfer Mode:        PIO       PIO       PIO       PIO
Address Setup:       90ns      90ns      90ns      90ns
Cmd Active:         360ns     360ns     360ns     360ns
Cmd Recovery:       540ns     540ns     540ns     540ns
Data Active:         90ns      90ns      90ns      90ns
Data Recovery:       30ns      30ns      90ns      30ns
Cycle Time:         120ns     120ns     180ns     120ns
Transfer Rate:   16.6MB/s  16.6MB/s  11.1MB/s  16.6MB/s
(taken from 2.2)

/dev/hda:

 Model=QUANTUM FIREBALL ST1.6A, FwRev=A0F.0800, SerialNo=851715434518
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=3128/16/63, TrkSize=32256, SectSize=512, ECCbytes=4
 BuffType=DualPortCache, BuffSize=81kB, MaxMultSect=16, MultSect=off
 CurCHS=3128/16/63, CurSects=3153024, LBA=yes, LBAsects=3153024
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: sdma0 sdma1 sdma2 mdma0 mdma1 mdma2 udma0 udma1 *udma2
 AdvancedPM=no
 Drive Supports : Reserved : ATA-1 ATA-2 ATA-3

/dev/hdb:

 Model=QUANTUM FIREBALL ST1.6A, FwRev=A0F.0400, SerialNo=851712135299
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=3128/16/63, TrkSize=32256, SectSize=512, ECCbytes=4
 BuffType=DualPortCache, BuffSize=81kB,
 MaxMultSect=16, MultSect=off
 CurCHS=3128/16/63, CurSects=3153024, LBA=yes, LBAsects=3153024
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: sdma0 sdma1 sdma2 mdma0 mdma1 mdma2 udma0 udma1 *udma2
 AdvancedPM=no
 Drive Supports : Reserved : ATA-1 ATA-2 ATA-3

/dev/hdd:

 Model=ST32531A, FwRev=0.62, SerialNo=VE047143
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
 RawCHS=4956/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=unknown, BuffSize=0kB, MaxMultSect=16, MultSect=off
 CurCHS=4956/16/63, CurSects=4996476, LBA=yes, LBAsects=4996476
 IORDY=on/off, tPIO={min:383,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio3 pio4
 DMA modes: mdma0 mdma1 *mdma2
 AdvancedPM=no
Segmentation fault

(yes, hdparm 4.5-1.2 from Debian testing segfaults on this one and it's
a userspace segfault, not a syscall one, as shown by strace).

00:00.0 Host bridge: Intel Corp. 430TX - 82439TX MTXC (rev 01)
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR-

at 0x220 irq 5 dma 1,5
at 0x330 irq 5 dma 0,0
sb: 1 Soundblaster PnP card(s) found.
YM3812 and OPL-3 driver Copyright (C) by Hannu Savolainen, Rob Hooft 1993-1996
at 0x388
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.17, 10 Jan 2002 on ide1(22,65), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Linux Tulip driver version 0.9.15-pre12 (Aug 9, 2002)
tulip0: EEPROM default media type Autosense.
tulip0: Index #0 - Media MII (#11) described by a 21142 MII PHY (3) block.
tulip0: MII transceiver #17 config 1000 status 782d advertising 01e1.
eth0: Digital DS21143 Tulip rev 65 at 0xc4859000, 00:48:54:12:83:3F, IRQ 10.
8139too Fast Ethernet driver 0.9.26
eth1: RealTek RTL8139 Fast Ethernet at 0xc4862000, 00:50:22:82:62:f0, IRQ 11
eth1: Identified 8139 chip type 'RTL-8139C'
eth1: Setting half-duplex based on auto-negotiated partner ability 0000.
eth0: Setting full-duplex based on MII#17 link partner capability of 45e1.
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).

-- 
Meelis Roos (mroos@tartu.cyber.ee)