From: Jesper Juhl
To: Linus Torvalds
Cc: Linux Kernel Mailing List, Andrew Morton, markhe@nextd.demon.co.uk,
    Andrea Arcangeli, Mike Christie, James Bottomley, Jesper Juhl
Subject: Re: Slab corruption in 2.6.16-rc5-mm2
Date: Mon, 6 Mar 2006 19:43:35 +0100
References: <200603060117.16484.jesper.juhl@gmail.com>
Message-Id: <200603061943.35502.jesper.juhl@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Monday 06 March 2006 19:25, Linus Torvalds wrote:
>
<...snip...>
> Anyway, Jesper, I see two potential reasons for this bug:
>
>  - total and utter slab confusion (the slab layer returned the same slab
>    allocation twice to two different callers). I consider this pretty
>    unlikely, because it's such a _major_ failure of the slab code, and the
>    slab code hasn't changed that much, but I mention it just in case.
>
>  - SCSI layer breakage. It might well be the low-level driver completing a
>    request too early, or it might be the re-trying. If it's the re-trying,
>    you could try just reverting that commit I pointed to (ie if you're a
>    git user, just do "git revert 17e01f21", otherwise you'd need to look
>    it up from gitweb and un-apply the patch)
>

Not a git user (I need to become one but haven't found the time to read up
on it yet), but no problem, I'll dig out the patch and try reverting it.

Luckily this seems to be repeatable on every boot: I find it in the logs
right after logging in, launching a shell on my KDE desktop and running
dmesg. I'll do a few more reboots to make sure it *really* is reproducible
before reverting the patch, so we can be sure whether it fixes the problem
or not.

Btw, the messages turn out slightly different on each boot; here are the
ones from the current boot of my box:

Slab corruption: start=f72b6b98, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [](sr_do_ioctl+0x11b/0x270)
000: 70 00 02 00 00 00 00 0a 00 00 00 00 3a 01 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f72b6b4c, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [](free_fdtable_rcu+0x66/0x150)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=f72b6be4, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<00000000>](_stext+0x3feffd68/0x8)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Slab corruption: start=f72b6b98, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [](sr_do_ioctl+0x11b/0x270)
000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f72b6b4c, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [](free_fdtable_rcu+0x66/0x150)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=f72b6be4, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<00000000>](_stext+0x3feffd68/0x8)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Slab corruption: start=f72b6b98, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [](ext3_clear_inode+0x29/0x40)
000: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00
010: 00 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Prev obj: start=f72b6b4c, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [](free_fdtable_rcu+0x66/0x150)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=f72b6be4, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<00000000>](_stext+0x3feffd68/0x8)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b

Would gathering more of these help you out?

> Regardless, Jesper, it would be great to hear _what_ strange CDROM device
> you have that would implied in sr_ioctl.c - is it USB, SATA or something
> else?
>

I have no USB, SATA or similar devices in the box, only a floppy drive, a
SCSI harddisk, a SCSI CD writer and a SCSI DVD-ROM.
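The hex rows in the slab reports earlier in this message can be read mechanically. What follows is a minimal Python sketch, not part of the original report. It parses the rows, lists which bytes no longer hold the slab free-poison value 0x6b, and then, on the assumption (an interpretation, not something stated in the thread) that the overwritten bytes are SCSI fixed-format sense data, extracts the sense key and ASC/ASCQ. The function names and the two DUMP constants are hypothetical helpers; the bytes themselves are copied from the first and third reports.

```python
POISON_FREE = 0x6B  # byte the slab debug code writes into freed objects

# Hex rows copied from the first and third "Slab corruption" reports.
DUMP_1 = """\
000: 70 00 02 00 00 00 00 0a 00 00 00 00 3a 01 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
"""
DUMP_3 = """\
000: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00
010: 00 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
"""


def parse_rows(text):
    """Turn 'offset: xx xx ...' rows into a bytes object."""
    data = bytearray()
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or non-hex lines
        _, _, hex_part = line.partition(":")
        data.extend(int(token, 16) for token in hex_part.split())
    return bytes(data)


def overwritten_offsets(data):
    """Offsets whose value is no longer the free-poison byte."""
    return [i for i, b in enumerate(data) if b != POISON_FREE]


def decode_fixed_sense(data):
    """Decode a few fields, ASSUMING fixed-format sense data (0x70/0x71).

    Returns None if the buffer does not look like fixed-format sense data.
    """
    if len(data) < 14 or (data[0] & 0x7F) not in (0x70, 0x71):
        return None
    return {
        "sense_key": data[2] & 0x0F,
        "asc": data[12],
        "ascq": data[13],
        "total_length": 8 + data[7],  # header plus additional sense length
    }


if __name__ == "__main__":
    for name, dump in (("first report", DUMP_1), ("third report", DUMP_3)):
        raw = parse_rows(dump)
        print(name, len(overwritten_offsets(raw)), decode_fixed_sense(raw))
```

Under that assumption, the first report reads as sense key 0x2 with ASC/ASCQ 0x3a/0x01 and the third as sense key 0x5 with ASC/ASCQ 0x24/0x00. In the third report the additional-length byte 0x0a gives 8 + 10 = 18 bytes, which matches the 18 bytes that no longer hold 0x6b.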
Here are some details:

$ cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 04 Lun: 00
  Vendor: PIONEER  Model: DVD-ROM DVD-305  Rev: 1.03
  Type:   CD-ROM                           ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 05 Lun: 00
  Vendor: PLEXTOR  Model: CD-R PX-W1210S   Rev: 1.01
  Type:   CD-ROM                           ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: IBM      Model: DDYS-T36950N     Rev: S96H
  Type:   Direct-Access                    ANSI SCSI revision: 03

From dmesg:

scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
        aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
  Vendor: PIONEER  Model: DVD-ROM DVD-305  Rev: 1.03
  Type:   CD-ROM                           ANSI SCSI revision: 02
 target0:0:4: Beginning Domain Validation
 target0:0:4: FAST-20 SCSI 20.0 MB/s ST (50 ns, offset 16)
 target0:0:4: Domain Validation skipping write tests
 target0:0:4: Ending Domain Validation
  Vendor: PLEXTOR  Model: CD-R PX-W1210S   Rev: 1.01
  Type:   CD-ROM                           ANSI SCSI revision: 02
 target0:0:5: Beginning Domain Validation
 target0:0:5: FAST-20 SCSI 20.0 MB/s ST (50 ns, offset 16)
 target0:0:5: Domain Validation skipping write tests
 target0:0:5: Ending Domain Validation
  Vendor: IBM      Model: DDYS-T36950N     Rev: S96H
  Type:   Direct-Access                    ANSI SCSI revision: 03
scsi0:A:6:0: Tagged Queuing enabled. Depth 200
 target0:0:6: Beginning Domain Validation
 target0:0:6: wide asynchronous
 target0:0:6: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 63)
 target0:0:6: Ending Domain Validation
SCSI device sda: 71687340 512-byte hdwr sectors (36704 MB)
sda: Write Protect is off
sda: Mode Sense: cb 00 00 08
SCSI device sda: drive cache: write back
SCSI device sda: 71687340 512-byte hdwr sectors (36704 MB)
sda: Write Protect is off
sda: Mode Sense: cb 00 00 08
SCSI device sda: drive cache: write back
 sda: sda1 sda2 sda3 sda4
sd 0:0:6:0: Attached scsi disk sda
sr0: scsi3-mmc drive: 16x/40x cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 0:0:4:0: Attached scsi CD-ROM sr0
sr1: scsi3-mmc drive: 32x/32x writer cd/rw xa/form2 cdda tray
sr 0:0:5:0: Attached scsi CD-ROM sr1
sr 0:0:4:0: Attached scsi generic sg0 type 5
sr 0:0:5:0: Attached scsi generic sg1 type 5
sd 0:0:6:0: Attached scsi generic sg2 type 0

# lspci -vvx
00:00.0 Host bridge: ALi Corporation M1695 K8 Northbridge [PCI Express and HyperTransport]
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- Reset- FastB2B-
    Capabilities: [40] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
        Address: 00000000fee00000  Data: 0000
    Capabilities: [58] #10 [0141]
    Capabilities: [7c] #08 [a800]
    Capabilities: [88] #08 [8825]
00: b9 10 4b 52 06 01 10 00 00 00 04 06 10 00 01 00
10: 00 00 00 00 00 00 00 00 00 01 01 00 f0 00 00 00
20: 20 ff 20 ff f0 ff 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 40 00 00 00 00 00 00 00 0a 01 03 00

00:02.0 PCI bridge: ALi Corporation: Unknown device 524c (prog-if 00 [Normal decode])
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Reset- FastB2B-
    Capabilities: [40] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
        Address: 00000000fee00000  Data: 0000
    Capabilities: [58] #10 [0141]
    Capabilities: [7c] #08 [a800]
    Capabilities: [88] #08 [8825]
00: b9 10 4c 52 06 01 10 00 00 00 04 06 10 00 01 00
10: 00 00 00 00 00 00 00 00 00 02 02 00 f0 00 00 00
20: 30 ff 30 ff f0 ff 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 40 00 00 00 00 00 00 00 0b 01 03 00

00:04.0 Host bridge: ALi Corporation M1689 K8 Northbridge [Super K8 Single Chip]
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
00: b9 10 89 16 06 01 10 00 00 00 00 06 00 00 00 00
10: 08 00 00 dc 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00

00:05.0 PCI bridge: ALi Corporation AGP8X Controller (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap- 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Reset- FastB2B-
00: b9 10 46 52 07 01 20 00 00 00 04 06 00 00 01 00
10: 00 00 00 00 00 00 00 00 00 03 03 40 f0 00 20 22
20: 40 ff 40 ff f0 c7 e0 d7 00 00 00 00 00 00 00 00
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0b 00

00:06.0 PCI bridge: ALi Corporation M5249 HTT to PCI Bridge (prog-if 01 [Subtractive decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Reset- FastB2B-
00: b9 10 49 52 07 01 00 00 00 01 04 06 00 00 01 00
10: 00 00 00 00 00 00 00 00 00 04 04 20 d0 d0 00 22
20: 50 ff 50 ff 00 88 00 88 00 00 00 00 00 00 00 00
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03 00

00:07.0 ISA bridge: ALi Corporation M1563 HyperTransport South Bridge (rev 70)
    Subsystem: ASRock Incorporation: Unknown device 1563
    Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR-
    Region 1: I/O ports at
    Region 2: I/O ports at
    Region 3: I/O ports at
    Region 4: I/O ports at ff00 [size=16]
00: b9 10 29 52 05 00 a0 02 c7 8a 01 01 00 20 00 00
10: f1 01 00 00 f5 03 00 00 71 01 00 00 75 03 00 00
20: 01 ff 00 00 00 00 00 00 00 00 00 00 49 18 29 52
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00

00:13.0 USB Controller: ALi Corporation USB 1.1 Controller (rev 03) (prog-if 10 [OHCI])
    Subsystem: ASRock Incorporation: Unknown device 5237
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR-
00: 2b 10 27 05 07 00 b0 02 03 00 00 03 10 20 00 00
10: 08 00 00 c8 00 e0 4f ff 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 2b 10 40 08
30: 00 00 4c ff dc 00 00 00 00 00 00 00 05 01 10 20

04:05.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 0a)
    Subsystem: Creative Labs SBLive! 5.1 eMicro 28028
    Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR-