2006-03-06 00:16:59

by Jesper Juhl

[permalink] [raw]
Subject: Slab corruption in 2.6.16-rc5-mm2


Just found the following in dmesg :

Slab corruption: start=f72948a0, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934eb>](sr_do_ioctl+0x11b/0x270)
000: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f7294854, len=64
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c0173923>](real_lookup+0x93/0xe0)
000: 6c 69 62 6b 64 65 69 6e 69 74 5f 6b 69 6f 5f 68
010: 74 74 70 5f 63 61 63 68 65 5f 63 6c 65 61 6e 65
Next obj: start=f72948ec, len=64
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c01c7c80>](ext3_init_block_alloc_info+0x20/0x70)
000: 10 cf ce f7 00 00 00 00 00 00 00 00 00 00 00 00
010: 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Slab corruption: start=f70aeab4, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934eb>](sr_do_ioctl+0x11b/0x270)
000: 70 00 02 00 00 00 00 0a 00 00 00 00 3a 01 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f70aea68, len=64
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c023d6df>](init_dev+0x5cf/0x630)
000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Next obj: start=f70aeb00, len=64
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c0173923>](real_lookup+0x93/0xe0)
000: 6c 69 62 62 6f 6f 73 74 5f 70 72 67 5f 65 78 65
010: 63 5f 6d 6f 6e 69 74 6f 72 2d 67 63 63 2d 31 5f


Machine is running 2.6.16-rc5-mm2 :
$ uname -a
Linux dragon 2.6.16-rc5-mm2 #1 SMP PREEMPT Mon Mar 6 00:06:54 CET 2006 i686 athlon-4 i386 GNU/Linux

CPU is a dualcore Athlon X2 4400+

Kernel is 32bit, build for i386.

The machine has 2GB of RAM.

Let me know what additional info would be useful, if any.

If patches need testing then just send them my way.


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html


2006-03-06 18:25:43

by Linus Torvalds

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2



On Mon, 6 Mar 2006, Jesper Juhl wrote:
>
> Slab corruption: start=f72948a0, len=64
> Redzone: 0x5a2cf071/0x5a2cf071.
> Last user: [<c02934eb>](sr_do_ioctl+0x11b/0x270)
> 000: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00
> 010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> Prev obj: start=f7294854, len=64
> Redzone: 0x170fc2a5/0x170fc2a5.
> Last user: [<c0173923>](real_lookup+0x93/0xe0)
> 000: 6c 69 62 6b 64 65 69 6e 69 74 5f 6b 69 6f 5f 68
> 010: 74 74 70 5f 63 61 63 68 65 5f 63 6c 65 61 6e 65
> Next obj: start=f72948ec, len=64
> Redzone: 0x170fc2a5/0x170fc2a5.
> Last user: [<c01c7c80>](ext3_init_block_alloc_info+0x20/0x70)
> 000: 10 cf ce f7 00 00 00 00 00 00 00 00 00 00 00 00
> 010: 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Ok, this is interesting because the previous and the next objects look
fine, which implies that it's not something else that overwrote it.

Perhaps more importantly, the actual corrupted object _should_ contain the
POISON_FREE bytes, but doesn't. In fact, what it _does_ contain is
apparently perfectly valid SCSI sense request data (if I read it right,
it's ASC/ASCQ of 24/00 means "Invalid field in cdb").

So that "last user: sr_do_ioctl" thing actually matches what the contents
are.

Your other smal corruption:

> Slab corruption: start=f70aeab4, len=64
> Redzone: 0x5a2cf071/0x5a2cf071.
> Last user: [<c02934eb>](sr_do_ioctl+0x11b/0x270)
> 000: 70 00 02 00 00 00 00 0a 00 00 00 00 3a 01 00 00
> 010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> Prev obj: start=f70aea68, len=64
> Redzone: 0x170fc2a5/0x170fc2a5.
> Last user: [<c023d6df>](init_dev+0x5cf/0x630)
> 000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> Next obj: start=f70aeb00, len=64
> Redzone: 0x170fc2a5/0x170fc2a5.
> Last user: [<c0173923>](real_lookup+0x93/0xe0)
> 000: 6c 69 62 62 6f 6f 73 74 5f 70 72 67 5f 65 78 65
> 010: 63 5f 6d 6f 6e 69 74 6f 72 2d 67 63 63 2d 31 5f

is exactly the same thing. The objects around it look ok. The object that
should be poisoned again looks like it contains normal request-sense data
(this time 3a/01: "Medium not present - tray closed", if I read it right).

So it really looks like in this case, the sr_do_ioctl() function free'd
the sense key object, but that the sense data was actually written _after_
the object was free'd.

That in turn would imply that scsi_execute() returned before the SCSI
command had actually completed. Now, scsi_execute() just calls to the
generic block device layer, which in turn just submits it, and then waits
for its completion, so if it returns too early, that means that it got
_completed_ too early by the driver.

Alternatively, it got re-tried. The reason I mention that is that we have
commit 17e01f216b611fc46956dcd9063aec4de75991e3, which changes
scsi_execute() to add retries.

I wonder if something does a "complete(rq->waiting)" while the thing is
still retrying? In general, I do not believe that we should retry special
commands that have been initiated by a user, we should return the error.
But I haven't thought this through.

Anyway, Jesper, I see two potential reasons for this bug:

- total and utter slab confusion (the slab layer returned the same slab
allocation twice to two different callers). I consider this pretty
unlikely, because it's such a _major_ failure of the slab code, and the
slab code hasn't changed that much, but I mention it just in case.

- SCSI layer breakage. It might well be the low-level driver completing a
request too early, or it migth be the re-trying. If it's the re-trying,
you could try just reverting that commit I pointed to (ie if you're a
git user, just do "git revert 17e01f21", otherwise you'd need to look
it up from gitweb and un-apply the patch)

Regardless, Jesper, it would be great to hear _what_ strange CDROM device
you have that would implied in sr_ioctl.c - is it USB, SATA or something
else?

James, Mike, can you double-check the retries? In particular, it's _wrong_
to retry after you've already marked a command completed with
"complete(rq->waiting)", so if that happens somewhere, things are really
broken.

Linus

2006-03-06 18:50:01

by Mike Christie

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

Mike Christie wrote:
> Linus Torvalds wrote:
>
>>
>> James, Mike, can you double-check the retries? In particular, it's
>> _wrong_ to retry after you've already marked a command completed with
>> "complete(rq->waiting)", so if that happens somewhere, things are
>> really broken.
>>
>
> I am looking into it. I think it has something to do with the request
> getting completed too early or maybe something crazy like twice. This
> looks like a similar problem that was reported to linux-scsi

Oh yeah here is that thread

http://marc.theaimsgroup.com/?l=linux-scsi&m=114127615918030&w=2

2006-03-06 18:43:20

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On Monday 06 March 2006 19:25, Linus Torvalds wrote:
>
<...snip...>
> Anyway, Jesper, I see two potential reasons for this bug:
>
> - total and utter slab confusion (the slab layer returned the same slab
> allocation twice to two different callers). I consider this pretty
> unlikely, because it's such a _major_ failure of the slab code, and the
> slab code hasn't changed that much, but I mention it just in case.
>
> - SCSI layer breakage. It might well be the low-level driver completing a
> request too early, or it migth be the re-trying. If it's the re-trying,
> you could try just reverting that commit I pointed to (ie if you're a
> git user, just do "git revert 17e01f21", otherwise you'd need to look
> it up from gitweb and un-apply the patch)
>
Not a git user (I need to become one but haven't found the time to read up
on it yet), but no problem, I'll dig out the patch and try reverting it.
Luckily it seems this is pretty repeatable on every boot, I find it in the
logs instantly after logging in and launching a shell on my KDE desktop and
running dmesg - I'll do a few more reboots to make sure it *really* is
reproducible before reverting the patch so we can be sure if it fixes the
problem or not.

Btw, the messages turn out slightly different on each boot, here are the
ones from this current boot of my box:

Slab corruption: start=f72b6b98, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934eb>](sr_do_ioctl+0x11b/0x270)
000: 70 00 02 00 00 00 00 0a 00 00 00 00 3a 01 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f72b6b4c, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c01813e6>](free_fdtable_rcu+0x66/0x150)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=f72b6be4, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<00000000>](_stext+0x3feffd68/0x8)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Slab corruption: start=f72b6b98, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934eb>](sr_do_ioctl+0x11b/0x270)
000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f72b6b4c, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c01813e6>](free_fdtable_rcu+0x66/0x150)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=f72b6be4, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<00000000>](_stext+0x3feffd68/0x8)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Slab corruption: start=f72b6b98, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c01d3769>](ext3_clear_inode+0x29/0x40)
000: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00
010: 00 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Prev obj: start=f72b6b4c, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c01813e6>](free_fdtable_rcu+0x66/0x150)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=f72b6be4, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<00000000>](_stext+0x3feffd68/0x8)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b

Would gathering more of these help you out?


> Regardless, Jesper, it would be great to hear _what_ strange CDROM device
> you have that would implied in sr_ioctl.c - is it USB, SATA or something
> else?
>
I have no USB, SATA or similar devices in the box, only a floppy drive, a
SCSI harddisk, a SCSI CD writer and a SCSI DVD-ROM.
Here are some details :


$ cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 04 Lun: 00
Vendor: PIONEER Model: DVD-ROM DVD-305 Rev: 1.03
Type: CD-ROM ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 05 Lun: 00
Vendor: PLEXTOR Model: CD-R PX-W1210S Rev: 1.01
Type: CD-ROM ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 06 Lun: 00
Vendor: IBM Model: DDYS-T36950N Rev: S96H
Type: Direct-Access ANSI SCSI revision: 03


>From dmesg :

scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
<Adaptec 29160N Ultra160 SCSI adapter>
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

Vendor: PIONEER Model: DVD-ROM DVD-305 Rev: 1.03
Type: CD-ROM ANSI SCSI revision: 02
target0:0:4: Beginning Domain Validation
target0:0:4: FAST-20 SCSI 20.0 MB/s ST (50 ns, offset 16)
target0:0:4: Domain Validation skipping write tests
target0:0:4: Ending Domain Validation
Vendor: PLEXTOR Model: CD-R PX-W1210S Rev: 1.01
Type: CD-ROM ANSI SCSI revision: 02
target0:0:5: Beginning Domain Validation
target0:0:5: FAST-20 SCSI 20.0 MB/s ST (50 ns, offset 16)
target0:0:5: Domain Validation skipping write tests
target0:0:5: Ending Domain Validation
Vendor: IBM Model: DDYS-T36950N Rev: S96H
Type: Direct-Access ANSI SCSI revision: 03
scsi0:A:6:0: Tagged Queuing enabled. Depth 200
target0:0:6: Beginning Domain Validation
target0:0:6: wide asynchronous
target0:0:6: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 63)
target0:0:6: Ending Domain Validation
SCSI device sda: 71687340 512-byte hdwr sectors (36704 MB)
sda: Write Protect is off
sda: Mode Sense: cb 00 00 08
SCSI device sda: drive cache: write back
SCSI device sda: 71687340 512-byte hdwr sectors (36704 MB)
sda: Write Protect is off
sda: Mode Sense: cb 00 00 08
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3 sda4
sd 0:0:6:0: Attached scsi disk sda
sr0: scsi3-mmc drive: 16x/40x cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 0:0:4:0: Attached scsi CD-ROM sr0
sr1: scsi3-mmc drive: 32x/32x writer cd/rw xa/form2 cdda tray
sr 0:0:5:0: Attached scsi CD-ROM sr1
sr 0:0:4:0: Attached scsi generic sg0 type 5
sr 0:0:5:0: Attached scsi generic sg1 type 5
sd 0:0:6:0: Attached scsi generic sg2 type 0


# lspci -vvx
00:00.0 Host bridge: ALi Corporation M1695 K8 Northbridge [PCI Express and HyperTransport]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0
Capabilities: [40] #08 [0060]
Capabilities: [5c] #08 [a800]
Capabilities: [68] #08 [9000]
Capabilities: [74] #08 [8000]
Capabilities: [7c] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
Address: 00000000fee00000 Data: 0000
00: b9 10 95 16 07 00 10 00 00 00 00 06 00 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00

00:01.0 PCI bridge: ALi Corporation: Unknown device 524b (prog-if 00 [Normal decode])
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0, cache line size 10
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
Memory behind bridge: ff200000-ff2fffff
BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
Capabilities: [40] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
Address: 00000000fee00000 Data: 0000
Capabilities: [58] #10 [0141]
Capabilities: [7c] #08 [a800]
Capabilities: [88] #08 [8825]
00: b9 10 4b 52 06 01 10 00 00 00 04 06 10 00 01 00
10: 00 00 00 00 00 00 00 00 00 01 01 00 f0 00 00 00
20: 20 ff 20 ff f0 ff 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 40 00 00 00 00 00 00 00 0a 01 03 00

00:02.0 PCI bridge: ALi Corporation: Unknown device 524c (prog-if 00 [Normal decode])
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0, cache line size 10
Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
Memory behind bridge: ff300000-ff3fffff
BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
Capabilities: [40] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
Address: 00000000fee00000 Data: 0000
Capabilities: [58] #10 [0141]
Capabilities: [7c] #08 [a800]
Capabilities: [88] #08 [8825]
00: b9 10 4c 52 06 01 10 00 00 00 04 06 10 00 01 00
10: 00 00 00 00 00 00 00 00 00 02 02 00 f0 00 00 00
20: 30 ff 30 ff f0 ff 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 40 00 00 00 00 00 00 00 0b 01 03 00

00:04.0 Host bridge: ALi Corporation M1689 K8 Northbridge [Super K8 Single Chip]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0
Region 0: Memory at dc000000 (32-bit, prefetchable) [size=64M]
Capabilities: [40] #08 [0024]
Capabilities: [60] #08 [8038]
Capabilities: [80] AGP version 3.0
Status: RQ=28 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW- AGP3- Rate=x1,x2,x4
Command: RQ=1 ArqSz=0 Cal=0 SBA- AGP- GART64- 64bit- FW- Rate=<none>
00: b9 10 89 16 06 01 10 00 00 00 00 06 00 00 00 00
10: 08 00 00 dc 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00

00:05.0 PCI bridge: ALi Corporation AGP8X Controller (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap- 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0
Bus: primary=00, secondary=03, subordinate=03, sec-latency=64
Memory behind bridge: ff400000-ff4fffff
Prefetchable memory behind bridge: c7f00000-d7efffff
BridgeCtl: Parity+ SERR+ NoISA- VGA+ MAbort- >Reset- FastB2B-
00: b9 10 46 52 07 01 20 00 00 00 04 06 00 00 01 00
10: 00 00 00 00 00 00 00 00 00 03 03 40 f0 00 20 22
20: 40 ff 40 ff f0 c7 e0 d7 00 00 00 00 00 00 00 00
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0b 00

00:06.0 PCI bridge: ALi Corporation M5249 HTT to PCI Bridge (prog-if 01 [Subtractive decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0
Bus: primary=00, secondary=04, subordinate=04, sec-latency=32
I/O behind bridge: 0000d000-0000dfff
Memory behind bridge: ff500000-ff5fffff
Prefetchable memory behind bridge: 88000000-880fffff
BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
00: b9 10 49 52 07 01 00 00 00 01 04 06 00 00 01 00
10: 00 00 00 00 00 00 00 00 00 04 04 20 d0 d0 00 22
20: 50 ff 50 ff 00 88 00 88 00 00 00 00 00 00 00 00
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03 00

00:07.0 ISA bridge: ALi Corporation M1563 HyperTransport South Bridge (rev 70)
Subsystem: ASRock Incorporation: Unknown device 1563
Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0 (250ns min, 6000ns max)
00: b9 10 63 15 0f 00 00 02 70 00 01 06 00 00 80 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 49 18 63 15
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 18

00:07.1 Bridge: ALi Corporation M7101 Power Management Controller [PMU]
Subsystem: ASRock Incorporation: Unknown device 7101
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
00: b9 10 01 71 00 00 00 02 00 00 80 06 00 00 80 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 49 18 01 71
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00:11.0 Ethernet controller: ALi Corporation M5263 Ethernet Controller (rev 40)
Subsystem: ASRock Incorporation: Unknown device 5263
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (5000ns min, 10000ns max), cache line size 08
Interrupt: pin A routed to IRQ 10
Region 0: I/O ports at e800 [size=256]
Region 1: Memory at ff6ffc00 (32-bit, non-prefetchable) [size=256]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: b9 10 63 52 07 01 10 02 40 00 00 02 08 20 00 00
10: 01 e8 00 00 00 fc 6f ff 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 49 18 63 52
30: 00 00 00 00 50 00 00 00 00 00 00 00 0a 01 14 28

00:12.0 IDE interface: ALi Corporation M5229 IDE (rev c7) (prog-if 8a [Master SecP PriP])
Subsystem: ASRock Incorporation: Unknown device 5229
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32
Interrupt: pin A routed to IRQ 0
Region 0: I/O ports at <ignored>
Region 1: I/O ports at <ignored>
Region 2: I/O ports at <ignored>
Region 3: I/O ports at <ignored>
Region 4: I/O ports at ff00 [size=16]
00: b9 10 29 52 05 00 a0 02 c7 8a 01 01 00 20 00 00
10: f1 01 00 00 f5 03 00 00 71 01 00 00 75 03 00 00
20: 01 ff 00 00 00 00 00 00 00 00 00 00 49 18 29 52
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00

00:13.0 USB Controller: ALi Corporation USB 1.1 Controller (rev 03) (prog-if 10 [OHCI])
Subsystem: ASRock Incorporation: Unknown device 5237
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (20000ns max), cache line size 10
Interrupt: pin A routed to IRQ 11
Region 0: Memory at ff6fe000 (32-bit, non-prefetchable) [size=4K]
00: b9 10 37 52 17 01 a8 02 03 10 03 0c 10 20 80 00
10: 00 e0 6f ff 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 49 18 37 52
30: 00 00 00 00 00 00 00 00 00 00 00 00 0b 01 00 50

00:13.1 USB Controller: ALi Corporation USB 1.1 Controller (rev 03) (prog-if 10 [OHCI])
Subsystem: ASRock Incorporation: Unknown device 5237
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (20000ns max), cache line size 10
Interrupt: pin B routed to IRQ 3
Region 0: Memory at ff6fd000 (32-bit, non-prefetchable) [size=4K]
00: b9 10 37 52 17 01 a8 02 03 10 03 0c 10 20 80 00
10: 00 d0 6f ff 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 49 18 37 52
30: 00 00 00 00 00 00 00 00 00 00 00 00 03 02 00 50

00:13.2 USB Controller: ALi Corporation USB 1.1 Controller (rev 03) (prog-if 10 [OHCI])
Subsystem: ASRock Incorporation: Unknown device 5237
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (20000ns max), cache line size 10
Interrupt: pin C routed to IRQ 11
Region 0: Memory at ff6fc000 (32-bit, non-prefetchable) [size=4K]
00: b9 10 37 52 17 01 a8 02 03 10 03 0c 10 20 80 00
10: 00 c0 6f ff 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 49 18 37 52
30: 00 00 00 00 00 00 00 00 00 00 00 00 0b 03 00 50

00:13.3 USB Controller: ALi Corporation USB 2.0 Controller (rev 01) (prog-if 20 [EHCI])
Subsystem: ASRock Incorporation: Unknown device 5239
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (4000ns min, 8000ns max), cache line size 10
Interrupt: pin D routed to IRQ 5
Region 0: Memory at ff6ff800 (32-bit, non-prefetchable) [size=256]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] #0a [2090]
00: b9 10 39 52 16 01 b0 02 01 20 03 0c 10 20 80 00
10: 00 f8 6f ff 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 49 18 39 52
30: 00 00 00 00 50 00 00 00 00 00 00 00 05 04 10 20

00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Capabilities: [80] #08 [2101]
00: 22 10 00 11 00 00 10 00 00 00 00 06 00 00 80 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 80 00 00 00 00 00 00 00 00 00 00 00

00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
00: 22 10 01 11 00 00 00 00 00 00 00 06 00 00 80 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
00: 22 10 02 11 00 00 00 00 00 00 00 06 00 00 80 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
00: 22 10 03 11 00 00 00 00 00 00 00 06 00 00 80 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

03:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA Parhelia AGP (rev 03) (prog-if 00 [VGA])
Subsystem: Matrox Graphics, Inc. Parhelia 128Mb
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (4000ns min, 8000ns max), cache line size 10
Interrupt: pin A routed to IRQ 5
Region 0: Memory at c8000000 (32-bit, prefetchable) [size=128M]
Region 1: Memory at ff4fe000 (32-bit, non-prefetchable) [size=8K]
Expansion ROM at ff4c0000 [disabled] [size=128K]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [f0] AGP version 2.0
Status: RQ=32 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW+ AGP3- Rate=x1,x2,x4
Command: RQ=1 ArqSz=0 Cal=0 SBA- AGP- GART64- 64bit- FW- Rate=<none>
00: 2b 10 27 05 07 00 b0 02 03 00 00 03 10 20 00 00
10: 08 00 00 c8 00 e0 4f ff 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 2b 10 40 08
30: 00 00 4c ff dc 00 00 00 00 00 00 00 05 01 10 20

04:05.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 0a)
Subsystem: Creative Labs SBLive! 5.1 eMicro 28028
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (500ns min, 5000ns max)
Interrupt: pin A routed to IRQ 20
Region 0: I/O ports at d880 [size=32]
Capabilities: [dc] Power Management version 1
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 02 11 02 00 05 01 90 02 0a 00 01 04 00 20 80 00
10: 81 d8 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 02 11 67 80
30: 00 00 00 00 dc 00 00 00 00 00 00 00 0b 01 02 14

04:05.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 0a)
Subsystem: Creative Labs Gameport Joystick
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32
Region 0: I/O ports at dc00 [size=8]
Capabilities: [dc] Power Management version 1
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 02 11 02 70 05 01 90 02 0a 00 80 09 00 20 80 00
10: 01 dc 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 02 11 20 00
30: 00 00 00 00 dc 00 00 00 00 00 00 00 00 00 00 00

04:06.0 SCSI storage controller: Adaptec AIC-7892A U160/m (rev 02)
Subsystem: Adaptec 29160N Ultra160 SCSI Controller
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (10000ns min, 6250ns max), cache line size 10
Interrupt: pin A routed to IRQ 19
BIST result: 00
Region 0: I/O ports at d400 [disabled] [size=256]
Region 1: Memory at ff5ff000 (64-bit, non-prefetchable) [size=4K]
Expansion ROM at 88000000 [disabled] [size=128K]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 05 90 80 00 16 01 b0 02 02 00 00 01 10 20 00 80
10: 01 d4 00 00 04 f0 5f ff 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 05 90 a0 62
30: 00 00 5c ff dc 00 00 00 00 00 00 00 03 01 28 19

04:07.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 42)
Subsystem: D-Link System Inc DFE-530TX rev B
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (750ns min, 2000ns max), cache line size 10
Interrupt: pin A routed to IRQ 18
Region 0: I/O ports at d000 [size=256]
Region 1: Memory at ff5fec00 (32-bit, non-prefetchable) [size=256]
Expansion ROM at 88020000 [disabled] [size=64K]
Capabilities: [40] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 06 11 65 30 17 01 10 02 42 00 00 02 10 20 00 00
10: 01 d0 00 00 00 ec 5f ff 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 86 11 01 14
30: 00 00 ff ff 40 00 00 00 00 00 00 00 02 01 03 08



/Jesper

2006-03-06 18:48:57

by Mike Christie

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

Linus Torvalds wrote:
>
> James, Mike, can you double-check the retries? In particular, it's _wrong_
> to retry after you've already marked a command completed with
> "complete(rq->waiting)", so if that happens somewhere, things are really
> broken.
>

I am looking into it. I think it has something to do with the request
getting completed too early or maybe something crazy like twice. This
looks like a similar problem that was reported to linux-scsi where for
some tape setup the request's bio gets freed twice.

2006-03-06 19:51:25

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On Monday 06 March 2006 20:32, Linus Torvalds wrote:
>
> On Mon, 6 Mar 2006, Jesper Juhl wrote:
> >
> > Not a git user (I need to become one but haven't found the time to read up
> > on it yet), but no problem, I'll dig out the patch and try reverting it.
>
> It's attached here.
>
Thanks.


> NOTE! I'm not at all sure it's the re-try logic. It could be something
> else. Anything that completes the request before it's actually totally
> done - or possibly re-uses the sense data for something else would be
> wrong and buggy.
>
Ohh well, let's work on the assumption that it is the re-try logic first,
then try something else if it turns out it isn't. I have no problem testing
a bunch of patches if needed.

<...>
> This is different. But it looks similar. It looks like the thing was
> actually re-allocated for something else (posix acl data?) but then

I doubt it's POSIX ACL data :

juhl@dragon:~/download/kernel/linux-2.6.16-rc5-mm2$ grep -i ACL .config
# CONFIG_FS_POSIX_ACL is not set


> overwritten. However, the overwritten data does look like SCSI sense
> information again ("Invalid field in cdb"), so I think it's the same
> thing despite the fact that it had gotten re-allocated for something else.
>
> > Would gathering more of these help you out?
>
> It's always interesting when trying to find the pattern, but I think the
> pattern is already pretty clear. sr_do_ioctl() seems to be the thing, and
> sense data is written too late.
>
Ok, it's reproducible on demand (at least I've now reproduced it on 6 more
boots), so if you need any more just let me know and I'll gather a few.


> > I have no USB, SATA or similar devices in the box, only a floppy drive, a
> > SCSI harddisk, a SCSI CD writer and a SCSI DVD-ROM.
>
> Well, the fact that you have a CDSI CD-writer and a SCSI DVD-ROM explains
> the thing, so that's all good.
>
> > scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
> > <Adaptec 29160N Ultra160 SCSI adapter>
> > aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
>
> So it's either an aic7xxx bug, or it's generic SCSI.
>
> Considering that we've had other slab corruption issues (the reason I was
> looking closely at yours), generic SCSI isn't out of the question.
>
> If you were a git user, doing a bisection run would be useful since you

Well, now is probably as good a time as any for becoming a git user,
tracking my own patches as individual plain-text files is getting
un-managable and as you say, bisection would be useful to be able to do.
I'll dig up some git docs and start reading.


> seem to be able to recreate it at will. Oh, well. Testign that one patch
> would still help.
>
Hmm, that patch does not apply to the 2.6.16-rc5-mm2 kernel :

patching file drivers/scsi/scsi_lib.c
Hunk #1 succeeded at 260 (offset 1 line).
Hunk #2 FAILED at 473.
Hunk #3 FAILED at 1394.
2 out of 3 hunks FAILED -- saving rejects to file drivers/scsi/scsi_lib.c.rej
patching file include/linux/blkdev.h
Hunk #1 succeeded at 190 (offset 6 lines).

I'll go see if the problem also exists in mainline - will report on that
shortly.


/Jesper


2006-03-06 19:32:30

by Linus Torvalds

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2



On Mon, 6 Mar 2006, Jesper Juhl wrote:
>
> Not a git user (I need to become one but haven't found the time to read up
> on it yet), but no problem, I'll dig out the patch and try reverting it.

It's attached here.

NOTE! I'm not at all sure it's the re-try logic. It could be something
else. Anything that completes the request before it's actually totally
done - or possibly re-uses the sense data for something else would be
wrong and buggy.

> Btw, the messages turn out slightly different on each boot, here are the
> ones from this current boot of my box:
>
> Slab corruption: start=f72b6b98, len=64
> Redzone: 0x5a2cf071/0x5a2cf071.
> Last user: [<c02934eb>](sr_do_ioctl+0x11b/0x270)
> 000: 70 00 02 00 00 00 00 0a 00 00 00 00 3a 01 00 00
> 010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Ok, same deal. "Medium not present - tray closed" sense data.

> Slab corruption: start=f72b6b98, len=64
> Redzone: 0x5a2cf071/0x5a2cf071.
> Last user: [<c02934eb>](sr_do_ioctl+0x11b/0x270)
> 000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Hmm. Totally empty sense data? Strange.

> Slab corruption: start=f72b6b98, len=64
> Redzone: 0x5a2cf071/0x5a2cf071.
> Last user: [<c01d3769>](ext3_clear_inode+0x29/0x40)
> 000: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00
> 010: 00 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b

This is different. But it looks similar. It looks like the thing was
actually re-allocated for something else (posix acl data?) but then
overwritten. However, the overwritten data does look like SCSI sense
information again ("Invalid field in cdb"), so I think it's the same
thing despite the fact that it had gotten re-allocated for something else.

> Would gathering more of these help you out?

It's always interesting when trying to find the pattern, but I think the
pattern is already pretty clear. sr_do_ioctl() seems to be the thing, and
sense data is written too late.

> I have no USB, SATA or similar devices in the box, only a floppy drive, a
> SCSI harddisk, a SCSI CD writer and a SCSI DVD-ROM.

Well, the fact that you have a CDSI CD-writer and a SCSI DVD-ROM explains
the thing, so that's all good.

> scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
> <Adaptec 29160N Ultra160 SCSI adapter>
> aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

So it's either an aic7xxx bug, or it's generic SCSI.

Considering that we've had other slab corruption issues (the reason I was
looking closely at yours), generic SCSI isn't out of the question.

If you were a git user, doing a bisection run would be useful since you
seem to be able to recreate it at will. Oh, well. Testign that one patch
would still help.

Linus


Attachments:
diff (2.00 kB)

2006-03-06 19:58:44

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/6/06, Jesper Juhl <[email protected]> wrote:
> On Monday 06 March 2006 20:32, Linus Torvalds wrote:
> >
> > seem to be able to recreate it at will. Oh, well. Testign that one patch
> > would still help.
> >
> Hmm, that patch does not apply to the 2.6.16-rc5-mm2 kernel :
>
I fixed it up by hand.
Building a new kernel at the moment - results in a short while.

>
> I'll go see if the problem also exists in mainline - will report on that
> shortly.
>
I'll still do this. Just downloading 2.6.16-rc5-git8 as we speak.


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-06 20:06:31

by Linus Torvalds

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2



On Mon, 6 Mar 2006, Linus Torvalds wrote:
>
> So it's either an aic7xxx bug, or it's generic SCSI.
>
> Considering that we've had other slab corruption issues (the reason I was
> looking closely at yours), generic SCSI isn't out of the question.
>
> If you were a git user, doing a bisection run would be useful since you
> seem to be able to recreate it at will. Oh, well. Testign that one patch
> would still help.

Hmm.. This appended patch may or may not help.

It overwrites the SCSI command "req" pointer when the request has been
done. The request cannot be used afterwards, so anybody accessing it would
be a bug. I think.

HOWEVER. I noticed something else strange. Your slab corruption report
says

Slab corruption: start=f72948a0, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934eb>](sr_do_ioctl+0x11b/0x270)
...

and the scary thing is that "len=64".

The thing is, SCSI uses "SCSI_SENSE_BUFFERSIZE" to determine the maximum
sense size to copy, and what do we have, if not

include/scsi/scsi_cmnd.h:#define SCSI_SENSE_BUFFERSIZE 96

ie a 64-byte buffer is simply TOO DAMN SMALL!

Now, the thing is, the 64 comes from "sizeof(struct request_sense)", which
is what "struct packet_command *" uses. We can change that sizeof() to
just use SCSI_SENSE_BUFFERSIZE, but that still makes me worry about
somebody else having allocated a "packed_command->sense" using just the
same 64-byte "struct request_sense".

Can a SCSI sense-buffer really be 96? If so, why is "struct request_sense"
the wrong size? This looks likea really nasty bug.

Linus

----
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 701a328..296ac8e 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -788,6 +788,9 @@ static struct scsi_cmnd *scsi_end_reques
}
}

+ /* Poison the request pointer: it is done and no longer exists after this */
+ cmd->request = (void *) 0x5a5a5a5a;
+
add_disk_randomness(req->rq_disk);

spin_lock_irqsave(q->queue_lock, flags);

2006-03-06 20:24:22

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On Monday 06 March 2006 21:06, Linus Torvalds wrote:
>
> On Mon, 6 Mar 2006, Linus Torvalds wrote:
> >
> > So it's either an aic7xxx bug, or it's generic SCSI.
> >
> > Considering that we've had other slab corruption issues (the reason I was
> > looking closely at yours), generic SCSI isn't out of the question.
> >
> > If you were a git user, doing a bisection run would be useful since you
> > seem to be able to recreate it at will. Oh, well. Testign that one patch
> > would still help.
>

Since the patch you sent me didn't apply cleanly to the mm kernel I made the
changes by hand. This is what I ended up with (should be the same end result
as what you intended as far as I can see) :

--- linux-2.6.16-rc5-mm2-orig/drivers/scsi/scsi_lib.c 2006-03-05 23:43:56.000000000 +0100
+++ linux-2.6.16-rc5-mm2/drivers/scsi/scsi_lib.c 2006-03-06 21:13:53.000000000 +0100
@@ -260,7 +260,6 @@ int scsi_execute(struct scsi_device *sde
memcpy(req->cmd, cmd, req->cmd_len);
req->sense = sense;
req->sense_len = 0;
- req->retries = retries;
req->timeout = timeout;
req->flags |= flags | REQ_BLOCK_PC | REQ_SPECIAL | REQ_QUIET;

@@ -478,7 +477,6 @@ int scsi_execute_async(struct scsi_devic
req->sense = sioc->sense;
req->sense_len = 0;
req->timeout = timeout;
- req->retries = retries;
req->end_io_data = sioc;

sioc->data = privdata;
@@ -1240,7 +1238,7 @@ static void scsi_setup_blk_pc_cmnd(struc
cmd->sc_data_direction = DMA_FROM_DEVICE;

cmd->transfersize = req->data_len;
- cmd->allowed = req->retries;
+ cmd->allowed = 3;
cmd->timeout_per_command = req->timeout;
cmd->done = scsi_blk_pc_done;
}
--- linux-2.6.16-rc5-mm2-orig/block/scsi_ioctl.c 2006-03-05 23:43:41.000000000 +0100
+++ linux-2.6.16-rc5-mm2/block/scsi_ioctl.c 2006-03-06 21:16:19.000000000 +0100
@@ -314,8 +314,6 @@ static int sg_io(struct file *file, requ
if (!rq->timeout)
rq->timeout = BLK_DEFAULT_TIMEOUT;

- rq->retries = 0;
-
start_time = jiffies;

/* ignore return value. All information is passed back to caller
@@ -433,7 +431,6 @@ static int sg_scsi_ioctl(struct file *fi
rq->data = buffer;
rq->data_len = bytes;
rq->flags |= REQ_BLOCK_PC;
- rq->retries = 0;

blk_execute_rq(q, bd_disk, rq, 0);
err = rq->errors & 0xff; /* only 8 bit SCSI status */
--- linux-2.6.16-rc5-mm2-orig/include/linux/blkdev.h 2006-03-05 23:44:06.000000000 +0100
+++ linux-2.6.16-rc5-mm2/include/linux/blkdev.h 2006-03-06 21:13:02.000000000 +0100
@@ -190,7 +190,6 @@ struct request {
void *sense;

unsigned int timeout;
- int retries;

/*
* For Power Management requests


Unfortunately that didn't fix it. After booting the patched kernel I found
this in dmesg :

Slab corruption: start=f77d768c, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934db>](sr_do_ioctl+0x11b/0x270)
000: 70 00 02 00 00 00 00 0a 00 00 00 00 3a 01 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f77d7640, len=64
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c023d64d>](init_dev+0x55d/0x630)
000: 00 01 00 00 05 00 00 00 bf 00 00 00 3b 8a 00 00
010: 00 03 1c 7f 15 04 00 01 00 11 13 1a 00 12 0f 17
Next obj: start=f77d76d8, len=64
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c01c7c80>](ext3_init_block_alloc_info+0x20/0x70)
000: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
010: 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Slab corruption: start=f7001640, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934db>](sr_do_ioctl+0x11b/0x270)
000: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f70015f4, len=64
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<f8bdf124>](__snd_util_mem_alloc+0x74/0x80 [snd_util_mem])
000: 00 10 00 00 00 00 00 00 b0 75 7d f7 50 33 bd f5
010: 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00
Next obj: start=f700168c, len=64
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c0173923>](real_lookup+0x93/0xe0)
000: 6c 69 62 62 6f 6f 73 74 5f 75 6e 69 74 5f 74 65
010: 73 74 5f 66 72 61 6d 65 77 6f 72 6b 2d 67 63 63


> Hmm.. This appended patch may or may not help.
>
I'll give it a spin.


> It overwrites the SCSI command "req" pointer when the request has been
> done. The request cannot be used afterwards, so anybody accessing it would
> be a bug. I think.
>
Let's see what happens.
I've applied it on top of the changes mentioned above - let me know if
that's wrong.


> HOWEVER. I noticed something else strange. Your slab corruption report
> says
>
> Slab corruption: start=f72948a0, len=64
> Redzone: 0x5a2cf071/0x5a2cf071.
> Last user: [<c02934eb>](sr_do_ioctl+0x11b/0x270)
> ...
>
> and the scary thing is that "len=64".
>
> The thing is, SCSI uses "SCSI_SENSE_BUFFERSIZE" to determine the maximum
> sense size to copy, and what do we have, if not
>
> include/scsi/scsi_cmnd.h:#define SCSI_SENSE_BUFFERSIZE 96
>
> ie a 64-byte buffer is simply TOO DAMN SMALL!
>
> Now, the thing is, the 64 comes from "sizeof(struct request_sense)", which
> is what "struct packet_command *" uses. We can change that sizeof() to
> just use SCSI_SENSE_BUFFERSIZE, but that still makes me worry about

I'll try that after booting with the patch you just supplied and let you
know the results shortly.


/Jesper

2006-03-06 20:31:17

by Jens Axboe

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On Mon, Mar 06 2006, Jesper Juhl wrote:
> On Monday 06 March 2006 21:06, Linus Torvalds wrote:
> >
> > On Mon, 6 Mar 2006, Linus Torvalds wrote:
> > >
> > > So it's either an aic7xxx bug, or it's generic SCSI.
> > >
> > > Considering that we've had other slab corruption issues (the reason I was
> > > looking closely at yours), generic SCSI isn't out of the question.
> > >
> > > If you were a git user, doing a bisection run would be useful since you
> > > seem to be able to recreate it at will. Oh, well. Testign that one patch
> > > would still help.
> >
>
> Since the patch you sent me didn't apply cleanly to the mm kernel I made the
> changes by hand. This is what I ended up with (should be the same end result
> as what you intended as far as I can see) :
>
> --- linux-2.6.16-rc5-mm2-orig/drivers/scsi/scsi_lib.c 2006-03-05 23:43:56.000000000 +0100
> +++ linux-2.6.16-rc5-mm2/drivers/scsi/scsi_lib.c 2006-03-06 21:13:53.000000000 +0100
> @@ -260,7 +260,6 @@ int scsi_execute(struct scsi_device *sde
> memcpy(req->cmd, cmd, req->cmd_len);
> req->sense = sense;
> req->sense_len = 0;
> - req->retries = retries;
> req->timeout = timeout;
> req->flags |= flags | REQ_BLOCK_PC | REQ_SPECIAL | REQ_QUIET;
>
> @@ -478,7 +477,6 @@ int scsi_execute_async(struct scsi_devic
> req->sense = sioc->sense;
> req->sense_len = 0;
> req->timeout = timeout;
> - req->retries = retries;
> req->end_io_data = sioc;
>
> sioc->data = privdata;
> @@ -1240,7 +1238,7 @@ static void scsi_setup_blk_pc_cmnd(struc
> cmd->sc_data_direction = DMA_FROM_DEVICE;
>
> cmd->transfersize = req->data_len;
> - cmd->allowed = req->retries;
> + cmd->allowed = 3;
> cmd->timeout_per_command = req->timeout;
> cmd->done = scsi_blk_pc_done;
> }
> --- linux-2.6.16-rc5-mm2-orig/block/scsi_ioctl.c 2006-03-05 23:43:41.000000000 +0100
> +++ linux-2.6.16-rc5-mm2/block/scsi_ioctl.c 2006-03-06 21:16:19.000000000 +0100
> @@ -314,8 +314,6 @@ static int sg_io(struct file *file, requ
> if (!rq->timeout)
> rq->timeout = BLK_DEFAULT_TIMEOUT;
>
> - rq->retries = 0;
> -
> start_time = jiffies;
>
> /* ignore return value. All information is passed back to caller
> @@ -433,7 +431,6 @@ static int sg_scsi_ioctl(struct file *fi
> rq->data = buffer;
> rq->data_len = bytes;
> rq->flags |= REQ_BLOCK_PC;
> - rq->retries = 0;
>
> blk_execute_rq(q, bd_disk, rq, 0);
> err = rq->errors & 0xff; /* only 8 bit SCSI status */
> --- linux-2.6.16-rc5-mm2-orig/include/linux/blkdev.h 2006-03-05 23:44:06.000000000 +0100
> +++ linux-2.6.16-rc5-mm2/include/linux/blkdev.h 2006-03-06 21:13:02.000000000 +0100
> @@ -190,7 +190,6 @@ struct request {
> void *sense;
>
> unsigned int timeout;
> - int retries;
>
> /*
> * For Power Management requests
>
>
> Unfortunately that didn't fix it. After booting the patched kernel I found
> this in dmesg :

I don't see how it could be, honestly, we would gladly oops in locally
close places if that was the case. If you disable slab debug/poison, do
you get a nice NULL pointer dereference instead? There have been some
reports on a NULL queue for sr devices as of lately, I wonder if some
SCSI change recently was broken.

Tejun, I seem to recall you looking at this, but I can't seem to locate
the thread. Did anything come of it?

--
Jens Axboe

2006-03-06 20:33:47

by Jens Axboe

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On Mon, Mar 06 2006, Jens Axboe wrote:
> I don't see how it could be, honestly, we would gladly oops in locally
> close places if that was the case. If you disable slab debug/poison, do
> you get a nice NULL pointer dereference instead? There have been some
> reports on a NULL queue for sr devices as of lately, I wonder if some
> SCSI change recently was broken.
>
> Tejun, I seem to recall you looking at this, but I can't seem to locate
> the thread. Did anything come of it?

This is the one:

http://marc.theaimsgroup.com/?l=linux-kernel&m=114041855331295&w=2

Also an -mm report, btw. Does this reproduce with 2.6.16-rcX latest?

--
Jens Axboe

2006-03-06 20:35:57

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On Monday 06 March 2006 21:06, Linus Torvalds wrote:
>
> On Mon, 6 Mar 2006, Linus Torvalds wrote:
> >
> > So it's either an aic7xxx bug, or it's generic SCSI.
> >
> > Considering that we've had other slab corruption issues (the reason I was
> > looking closely at yours), generic SCSI isn't out of the question.
> >
> > If you were a git user, doing a bisection run would be useful since you
> > seem to be able to recreate it at will. Oh, well. Testign that one patch
> > would still help.
>
> Hmm.. This appended patch may or may not help.
>
> It overwrites the SCSI command "req" pointer when the request has been
> done. The request cannot be used afterwards, so anybody accessing it would
> be a bug. I think.
>

With the retry code removed and your req poisoning patch on top I just got this :

Slab corruption: start=f727c5a8, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934db>](sr_do_ioctl+0x11b/0x270)
000: 70 00 02 00 00 00 00 0a 00 00 00 00 3a 01 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f727c55c, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c01813ee>](free_fdtable_rcu+0x6e/0x150)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=f727c5f4, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c01813ee>](free_fdtable_rcu+0x6e/0x150)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Slab corruption: start=f727c5a8, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934db>](sr_do_ioctl+0x11b/0x270)
000: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f727c55c, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c01813ee>](free_fdtable_rcu+0x6e/0x150)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=f727c5f4, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c01813ee>](free_fdtable_rcu+0x6e/0x150)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b

and another, probably unrelated, thing I just noticed in my dmesg output:

initcall at 0xc0428240: init_hpet_clocksource+0x0/0x90(): returned with error code -19


> HOWEVER. I noticed something else strange. Your slab corruption report
> says
>
> Slab corruption: start=f72948a0, len=64
> Redzone: 0x5a2cf071/0x5a2cf071.
> Last user: [<c02934eb>](sr_do_ioctl+0x11b/0x270)
> ...
>
> and the scary thing is that "len=64".
>
> The thing is, SCSI uses "SCSI_SENSE_BUFFERSIZE" to determine the maximum
> sense size to copy, and what do we have, if not
>
> include/scsi/scsi_cmnd.h:#define SCSI_SENSE_BUFFERSIZE 96
>
> ie a 64-byte buffer is simply TOO DAMN SMALL!
>
> Now, the thing is, the 64 comes from "sizeof(struct request_sense)", which
> is what "struct packet_command *" uses. We can change that sizeof() to
> just use SCSI_SENSE_BUFFERSIZE, but that still makes me worry about

Building a kernel with that change on top of the other ones atm.


/ Jesper

2006-03-06 20:53:17

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/6/06, Jesper Juhl <[email protected]> wrote:
> On Monday 06 March 2006 21:06, Linus Torvalds wrote:
> >
<...snip...>
> > and the scary thing is that "len=64".
> >
> > The thing is, SCSI uses "SCSI_SENSE_BUFFERSIZE" to determine the maximum
> > sense size to copy, and what do we have, if not
> >
> > include/scsi/scsi_cmnd.h:#define SCSI_SENSE_BUFFERSIZE 96
> >
> > ie a 64-byte buffer is simply TOO DAMN SMALL!
> >
> > Now, the thing is, the 64 comes from "sizeof(struct request_sense)", which
> > is what "struct packet_command *" uses. We can change that sizeof() to
> > just use SCSI_SENSE_BUFFERSIZE, but that still makes me worry about
>
> Building a kernel with that change on top of the other ones atm.
>
Changing the sizeof() to SCSI_SENSE_BUFFERSIZE doesn't fix it :

Slab corruption: start=f79da5a8, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934db>](sr_do_ioctl+0x11b/0x270)
000: 70 00 02 00 00 00 00 0a 00 00 00 00 3a 01 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f79da55c, len=64
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c0158918>](__vmalloc_node+0x68/0x80)
000: d0 1e 1e c3 18 1f 1e c3 60 1f 1e c3 a8 1f 1e c3
010: f0 1f 1e c3 38 20 1e c3 80 20 1e c3 c8 20 1e c3
Next obj: start=f79da5f4, len=64
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c0173923>](real_lookup+0x93/0xe0)
000: 6c 69 62 62 6f 6f 73 74 5f 70 72 67 5f 65 78 65
010: 63 5f 6d 6f 6e 69 74 6f 72 2d 67 63 63 2d 6d 74
Slab corruption: start=f79da5a8, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934db>](sr_do_ioctl+0x11b/0x270)
000: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f79da55c, len=64
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c0158918>](__vmalloc_node+0x68/0x80)
000: d0 1e 1e c3 18 1f 1e c3 60 1f 1e c3 a8 1f 1e c3
010: f0 1f 1e c3 38 20 1e c3 80 20 1e c3 c8 20 1e c3
Next obj: start=f79da5f4, len=64
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c0173923>](real_lookup+0x93/0xe0)
000: 6c 69 62 62 6f 6f 73 74 5f 70 72 67 5f 65 78 65
010: 63 5f 6d 6f 6e 69 74 6f 72 2d 67 63 63 2d 6d 74

I'll now go test the things Jens suggested. Expect more feedback shortly.

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-06 20:56:29

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/6/06, Jesper Juhl <[email protected]> wrote:
> On 3/6/06, Jesper Juhl <[email protected]> wrote:
> > On Monday 06 March 2006 21:06, Linus Torvalds wrote:
> > >
> <...snip...>
> > > and the scary thing is that "len=64".
> > >
> > > The thing is, SCSI uses "SCSI_SENSE_BUFFERSIZE" to determine the maximum
> > > sense size to copy, and what do we have, if not
> > >
> > > include/scsi/scsi_cmnd.h:#define SCSI_SENSE_BUFFERSIZE 96
> > >
> > > ie a 64-byte buffer is simply TOO DAMN SMALL!
> > >
> > > Now, the thing is, the 64 comes from "sizeof(struct request_sense)", which
> > > is what "struct packet_command *" uses. We can change that sizeof() to
> > > just use SCSI_SENSE_BUFFERSIZE, but that still makes me worry about
> >
> > Building a kernel with that change on top of the other ones atm.
> >
> Changing the sizeof() to SCSI_SENSE_BUFFERSIZE doesn't fix it :
>
> Slab corruption: start=f79da5a8, len=64

Hmm, is it just me or should that len= have read len=96 ???

This is the change I made :

--- linux-2.6.16-rc5-mm2/block/scsi_ioctl.c~ 2006-03-06
21:43:56.000000000 +0100
+++ linux-2.6.16-rc5-mm2/block/scsi_ioctl.c 2006-03-06
21:43:56.000000000 +0100
@@ -568,7 +568,7 @@ int scsi_cmd_ioctl(struct file *file, st
hdr.dxferp = cgc.buffer;
hdr.sbp = cgc.sense;
if (hdr.sbp)
- hdr.mx_sb_len = sizeof(struct request_sense);
+ hdr.mx_sb_len = SCSI_SENSE_BUFFERSIZE;
hdr.timeout = cgc.timeout;
hdr.cmdp = ((struct cdrom_generic_command
__user*) arg)->cmd;
hdr.cmd_len = sizeof(cgc.cmd);

did I mess up?


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-06 21:08:08

by Linus Torvalds

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2



On Mon, 6 Mar 2006, Jesper Juhl wrote:
>
> Hmm, is it just me or should that len= have read len=96 ???
>
> This is the change I made :
>
> --- linux-2.6.16-rc5-mm2/block/scsi_ioctl.c~ 2006-03-06
> 21:43:56.000000000 +0100
> +++ linux-2.6.16-rc5-mm2/block/scsi_ioctl.c 2006-03-06
> 21:43:56.000000000 +0100
> @@ -568,7 +568,7 @@ int scsi_cmd_ioctl(struct file *file, st
> hdr.dxferp = cgc.buffer;
> hdr.sbp = cgc.sense;
> if (hdr.sbp)
> - hdr.mx_sb_len = sizeof(struct request_sense);
> + hdr.mx_sb_len = SCSI_SENSE_BUFFERSIZE;
> hdr.timeout = cgc.timeout;
> hdr.cmdp = ((struct cdrom_generic_command __user*) arg)->cmd;
> hdr.cmd_len = sizeof(cgc.cmd);
>
> did I mess up?

That's not the one to change. It's the one in "sr_do_ioctl()", where it
uses "sizeof(*sense)".

Linus

----
diff --git a/drivers/scsi/sr_ioctl.c b/drivers/scsi/sr_ioctl.c
index 5d02ff4..b65462f 100644
--- a/drivers/scsi/sr_ioctl.c
+++ b/drivers/scsi/sr_ioctl.c
@@ -192,7 +192,7 @@ int sr_do_ioctl(Scsi_CD *cd, struct pack
SDev = cd->device;

if (!sense) {
- sense = kmalloc(sizeof(*sense), GFP_KERNEL);
+ sense = kmalloc(SCSI_SENSE_BUFFERSIZE, GFP_KERNEL);
if (!sense) {
err = -ENOMEM;
goto out;

2006-03-06 21:14:41

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/6/06, Jens Axboe <[email protected]> wrote:
<snip>
>
> This is the one:
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=114041855331295&w=2
>
> Also an -mm report, btw. Does this reproduce with 2.6.16-rcX latest?
>

I just build and booted a2.6.16-rc5-git8 with the same config that I
used for the -mm kernel and the problem did not manifest itself there.
So it seems that mainline is fine but we need to find the bug before
it propagates from mm to mainline.

I'll test a -mm kernel without slab debug/poison now to see it it goes Oops.

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-06 21:17:01

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/6/06, Linus Torvalds <[email protected]> wrote:
>
>
> On Mon, 6 Mar 2006, Jesper Juhl wrote:
> >
> > Hmm, is it just me or should that len= have read len=96 ???
> >
> > This is the change I made :
> >
> > --- linux-2.6.16-rc5-mm2/block/scsi_ioctl.c~ 2006-03-06
> > 21:43:56.000000000 +0100
> > +++ linux-2.6.16-rc5-mm2/block/scsi_ioctl.c 2006-03-06
> > 21:43:56.000000000 +0100
> > @@ -568,7 +568,7 @@ int scsi_cmd_ioctl(struct file *file, st
> > hdr.dxferp = cgc.buffer;
> > hdr.sbp = cgc.sense;
> > if (hdr.sbp)
> > - hdr.mx_sb_len = sizeof(struct request_sense);
> > + hdr.mx_sb_len = SCSI_SENSE_BUFFERSIZE;
> > hdr.timeout = cgc.timeout;
> > hdr.cmdp = ((struct cdrom_generic_command __user*) arg)->cmd;
> > hdr.cmd_len = sizeof(cgc.cmd);
> >
> > did I mess up?
>
> That's not the one to change. It's the one in "sr_do_ioctl()", where it
> uses "sizeof(*sense)".
>

Ahh, so I did mess up - whoops - I just grep'ed for "sizeof(struct
request_sense)" :-(

I'll try it again (with the correct change) in a moment, after I've
tested Jens's "does no slab poison/debug make it go Oops" question...

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-06 21:41:10

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/6/06, Jens Axboe <[email protected]> wrote:
[...snip...]
>
> I don't see how it could be, honestly, we would gladly oops in locally
> close places if that was the case. If you disable slab debug/poison, do
> you get a nice NULL pointer dereference instead? There have been some
> reports on a NULL queue for sr devices as of lately, I wonder if some
> SCSI change recently was broken.
>

I just build and booted a plain 2.6.16-rc5-mm2 with the same config
I've used previously, but with the following options disabled :

CONFIG_DEBUG_SLAB
CONFIG_PAGE_OWNER
CONFIG_DEBUG_VM
CONFIG_DEBUG_PAGEALLOC

The resulting kernel boots and runs just fine (no Oops) and leaves
nothing in dmesg.
So, without the debugging options it appears to the user that
everything is OK - nasty.

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-06 21:54:29

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/6/06, Linus Torvalds <[email protected]> wrote:
>
[.snip.]
>
> That's not the one to change. It's the one in "sr_do_ioctl()", where it
> uses "sizeof(*sense)".
>
> Linus
>
> ----
> diff --git a/drivers/scsi/sr_ioctl.c b/drivers/scsi/sr_ioctl.c
> index 5d02ff4..b65462f 100644
> --- a/drivers/scsi/sr_ioctl.c
> +++ b/drivers/scsi/sr_ioctl.c
> @@ -192,7 +192,7 @@ int sr_do_ioctl(Scsi_CD *cd, struct pack
> SDev = cd->device;
>
> if (!sense) {
> - sense = kmalloc(sizeof(*sense), GFP_KERNEL);
> + sense = kmalloc(SCSI_SENSE_BUFFERSIZE, GFP_KERNEL);
> if (!sense) {
> err = -ENOMEM;
> goto out;
>

Ok, booting a plain 2.6.16-rc5-mm2 kernel with the above being the
only change made results in this :

Slab corruption: start=f4f6a11c, len=128
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934eb>](sr_do_ioctl+0x11b/0x270)
000: 70 00 02 00 00 00 00 0a 00 00 00 00 3a 01 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f4f6a090, len=128
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c01f4a26>](alloc_as_io_context+0x16/0xd0)
000: 01 00 00 00 00 00 00 00 ad 4e ad de ff ff ff ff
010: ff ff ff ff b0 49 1f c0 c0 49 1f c0 07 00 00 00
Next obj: start=f4f6a1a8, len=128
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c01f4a26>](alloc_as_io_context+0x16/0xd0)
000: 01 00 00 00 00 00 00 00 ad 4e ad de ff ff ff ff
010: ff ff ff ff b0 49 1f c0 c0 49 1f c0 07 00 00 00
Slab corruption: start=f4f6a11c, len=128
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934eb>](sr_do_ioctl+0x11b/0x270)
000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f4f6a090, len=128
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c01f4a26>](alloc_as_io_context+0x16/0xd0)
000: 01 00 00 00 00 00 00 00 ad 4e ad de ff ff ff ff
010: ff ff ff ff b0 49 1f c0 c0 49 1f c0 07 00 00 00
Next obj: start=f4f6a1a8, len=128
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c01f4a26>](alloc_as_io_context+0x16/0xd0)
000: 01 00 00 00 00 00 00 00 ad 4e ad de ff ff ff ff
010: ff ff ff ff b0 49 1f c0 c0 49 1f c0 07 00 00 00


Where do we go from here ?


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-06 21:55:48

by Dave Jones

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On Mon, Mar 06, 2006 at 10:41:07PM +0100, Jesper Juhl wrote:

> CONFIG_DEBUG_SLAB
> CONFIG_PAGE_OWNER
> CONFIG_DEBUG_VM
> CONFIG_DEBUG_PAGEALLOC
>
> The resulting kernel boots and runs just fine (no Oops) and leaves
> nothing in dmesg.
> So, without the debugging options it appears to the user that
> everything is OK - nasty.

DEBUG_PAGEALLOC in particular is *fantastic* at making bugs hide.
I've lost many an hour trying to pin bugs down due to that.

Dave


--
http://www.codemonkey.org.uk

2006-03-06 21:57:15

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/6/06, Dave Jones <[email protected]> wrote:
> On Mon, Mar 06, 2006 at 10:41:07PM +0100, Jesper Juhl wrote:
>
> > CONFIG_DEBUG_SLAB
> > CONFIG_PAGE_OWNER
> > CONFIG_DEBUG_VM
> > CONFIG_DEBUG_PAGEALLOC
> >
> > The resulting kernel boots and runs just fine (no Oops) and leaves
> > nothing in dmesg.
> > So, without the debugging options it appears to the user that
> > everything is OK - nasty.
>
> DEBUG_PAGEALLOC in particular is *fantastic* at making bugs hide.
> I've lost many an hour trying to pin bugs down due to that.
>

Well, in this case, turning the option *off* hides the bug ;)

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-06 22:07:45

by Andrew Morton

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

"Jesper Juhl" <[email protected]> wrote:
>
> Where do we go from here ?
>

If you can test just

2.6.16-rc5 + linus.patch + git-scsi-misc.patch

then we'd have a clearer idea.

2006-03-06 22:08:39

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/6/06, Andrew Morton <[email protected]> wrote:
> "Jesper Juhl" <[email protected]> wrote:
> >
> > Where do we go from here ?
> >
>
> If you can test just
>
> 2.6.16-rc5 + linus.patch + git-scsi-misc.patch
>
> then we'd have a clearer idea.
>
Sure, I'll get right on it.
I'll post the results in 15min or so.

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-06 22:18:22

by Linus Torvalds

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2



On Mon, 6 Mar 2006, Jesper Juhl wrote:
>
> Ok, booting a plain 2.6.16-rc5-mm2 kernel with the above being the
> only change made results in this :

Yeah. I'm not surprised. A real mode-sense shouldn't be even 64 bytes,
much less 96, so it shouldn't have overflowed, and we had no indication of
the corruption spreading past the one allocation anyway.

It did/does seem a bug, though, so worth checking.

So onward in our tireless battle. Does this patch make any difference for
you? It does two things:

- it clears the "->sense" buffer in blk_end_sync_rq() (since it won't be
valid any more: the request is gone)
- it adds a BUG_ON() if we appear to have already done the sense fill on
SCSI IO completion, and do it again.

Now, I've not tried either of these, and the BUG_ON() in particular might
be a false positive itself, but it might be worth testing.

Linus

---
diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 03d9c82..4351d34 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -2637,6 +2637,7 @@ void blk_end_sync_rq(struct request *rq,
struct completion *waiting = rq->waiting;

rq->waiting = NULL;
+ rq->sense = NULL;
__blk_put_request(rq->q, rq);

/*
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 701a328..2b60769 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -961,6 +961,10 @@ void scsi_io_completion(struct scsi_cmnd
if (result) {
clear_errors = 0;
if (sense_valid && req->sense) {
+
+ /* Have we already filled the sense buffer? */
+ BUG_ON(req->sense_len);
+
/*
* SG_IO wants current and deferred errors
*/

2006-03-06 22:27:14

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/6/06, Jesper Juhl <[email protected]> wrote:
> On 3/6/06, Andrew Morton <[email protected]> wrote:
> > "Jesper Juhl" <[email protected]> wrote:
> > >
> > > Where do we go from here ?
> > >
> >
> > If you can test just
> >
> > 2.6.16-rc5 + linus.patch + git-scsi-misc.patch
> >
> > then we'd have a clearer idea.
> >
> Sure, I'll get right on it.
> I'll post the results in 15min or so.
>

Ok, a plain 2.6.15-rc5 + linus.patch + git-scsi-misc.patch results in this :

Slab corruption: start=f4812d14, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c028f61b>](sr_do_ioctl+0x11b/0x270)
000: 70 00 02 00 00 00 00 0a 00 00 00 00 3a 01 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f4812cc8, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<00000000>](_stext+0x3feffd68/0x8)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=f4812d60, len=64
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c02367ef>](init_dev+0x5cf/0x630)
000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Slab corruption: start=f4812d14, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c028f61b>](sr_do_ioctl+0x11b/0x270)
000: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f4812cc8, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<00000000>](_stext+0x3feffd68/0x8)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=f4812d60, len=64
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c02367ef>](init_dev+0x5cf/0x630)
000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-06 22:35:08

by Linus Torvalds

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2



Ok,
I have a new favorite suspect.

It is this one: commit 4d268eba1187ef66844a6a33b9431e5d0dadd4ad:

[PATCH] slab: extract slab order calculation to separate function

This patch moves the ugly loop that determines the 'optimal' size (page order)
of cache slabs from kmem_cache_create() to a separate function and cleans it
up a bit.

Thanks to Matthew Wilcox for the help with this patch.

Signed-off-by: Matthew Dobson <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

and I think it may be broken.

In particular, as far as I can tell, that

+ /* More than offslab_limit objects will cause problems */
+ if (flags & CFLGS_OFF_SLAB && cachep->num > offslab_limit)
+ break;

has been incorrectly translated for several reasons:

- we shouldn't check "cachep->num > offslab_limit". We should check just
"num > offslab_limit" (cachep->num is the _previous_ number we tested).

- when we do "break", we've already incremented "gfporder", and we should
correct for that.

Now, maybe I'm just off my rocker again (I've certainly been batting 0.000
so far, even if I think I've been finding real bugs). So who knows. But I
get the feeling that that patch is broken.

Either revert it, or try this (TOTALLY UNTESTED!!!) patch..

And hey, maybe I'm just crazy.

Linus

----
diff --git a/mm/slab.c b/mm/slab.c
index 2b0b151..1cca41d 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1628,25 +1628,22 @@ static inline size_t calculate_slab_orde
size_t size, size_t align, unsigned long flags)
{
size_t left_over = 0;
+ int gfporder;

- for (;; cachep->gfporder++) {
+ for (gfporder = 0 ; gfporder < MAX_GFP_ORDER; gfporder++) {
unsigned int num;
size_t remainder;

- if (cachep->gfporder > MAX_GFP_ORDER) {
- cachep->num = 0;
- break;
- }
-
- cache_estimate(cachep->gfporder, size, align, flags,
- &remainder, &num);
+ cache_estimate(gfporder, size, align, flags, &remainder, &num);
if (!num)
continue;
+
/* More than offslab_limit objects will cause problems */
- if (flags & CFLGS_OFF_SLAB && cachep->num > offslab_limit)
+ if ((flags & CFLGS_OFF_SLAB) && num > offslab_limit)
break;

cachep->num = num;
+ cachep->gfporder = gfporder;
left_over = remainder;

/*

2006-03-06 22:44:08

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/6/06, Linus Torvalds <[email protected]> wrote:
>
>
> On Mon, 6 Mar 2006, Jesper Juhl wrote:
> >
> > Ok, booting a plain 2.6.16-rc5-mm2 kernel with the above being the
> > only change made results in this :
>
> Yeah. I'm not surprised. A real mode-sense shouldn't be even 64 bytes,
> much less 96, so it shouldn't have overflowed, and we had no indication of
> the corruption spreading past the one allocation anyway.
>
> It did/does seem a bug, though, so worth checking.
>
Well, hopefully the SCSI people can take a look at that as a seperate issue...


> So onward in our tireless battle. Does this patch make any difference for
> you? It does two things:
>
> - it clears the "->sense" buffer in blk_end_sync_rq() (since it won't be
> valid any more: the request is gone)
> - it adds a BUG_ON() if we appear to have already done the sense fill on
> SCSI IO completion, and do it again.
>
> Now, I've not tried either of these, and the BUG_ON() in particular might
> be a false positive itself, but it might be worth testing.
>

Unfortunately this doesn't seem to make a difference :

Slab corruption: start=f70c0770, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934fb>](sr_do_ioctl+0x11b/0x270)
000: 70 00 02 00 00 00 00 0a 00 00 00 00 3a 01 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f70c0724, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c01813e6>](free_fdtable_rcu+0x66/0x150)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=f70c07bc, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c01813ee>](free_fdtable_rcu+0x6e/0x150)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Slab corruption: start=f70c0770, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c02934fb>](sr_do_ioctl+0x11b/0x270)
000: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00
010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f70c0724, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c01813e6>](free_fdtable_rcu+0x66/0x150)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=f70c07bc, len=64
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c01813ee>](free_fdtable_rcu+0x6e/0x150)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-06 22:52:05

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/6/06, Linus Torvalds <[email protected]> wrote:
>
> Ok,
> I have a new favorite suspect.
>
Heh, you are coming up with stuff to test faster than I can build &
boot kernels ;-)
Which is good, we'll get to the bottom of this all the faster :)


> It is this one: commit 4d268eba1187ef66844a6a33b9431e5d0dadd4ad:
>
[--snip--]
>
> Now, maybe I'm just off my rocker again (I've certainly been batting 0.000
> so far, even if I think I've been finding real bugs). So who knows. But I
> get the feeling that that patch is broken.
>
> Either revert it, or try this (TOTALLY UNTESTED!!!) patch..
>
Hmm, that patch doesn't apply at all to 2.6.16-rc5-mm2 :/

patching file mm/slab.c
Hunk #1 FAILED at 1628.
1 out of 1 hunk FAILED -- saving rejects to file mm/slab.c.rej


$ cat mm/slab.c.rej
***************
*** 1628,1649 ****
size_t size, size_t align, unsigned long flags)
{
size_t left_over = 0;
- int gfporder;

- for (gfporder = 0 ; gfporder < MAX_GFP_ORDER; gfporder++) {
unsigned int num;
size_t remainder;

- cache_estimate(gfporder, size, align, flags, &remainder, &num);
if (!num)
continue;
-
/* More than offslab_limit objects will cause problems */
- if ((flags & CFLGS_OFF_SLAB) && num > offslab_limit)
break;

cachep->num = num;
- cachep->gfporder = gfporder;
left_over = remainder;

/*
--- 1628,1652 ----
size_t size, size_t align, unsigned long flags)
{
size_t left_over = 0;

+ for (;; cachep->gfporder++) {
unsigned int num;
size_t remainder;

+ if (cachep->gfporder > MAX_GFP_ORDER) {
+ cachep->num = 0;
+ break;
+ }
+
+ cache_estimate(cachep->gfporder, size, align, flags,
+ &remainder, &num);
if (!num)
continue;
/* More than offslab_limit objects will cause problems */
+ if (flags & CFLGS_OFF_SLAB && cachep->num > offslab_limit)
break;

cachep->num = num;
left_over = remainder;

/*



> And hey, maybe I'm just crazy.
>
Somehow I don't think that's the core problem here ;)


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-06 22:54:58

by Linus Torvalds

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2



On Mon, 6 Mar 2006, Linus Torvalds wrote:
>
> Either revert it, or try this (TOTALLY UNTESTED!!!) patch..

Don't even bother with the untested patch.

> + for (gfporder = 0 ; gfporder < MAX_GFP_ORDER; gfporder++) {

At a minimum, this "<" needs to be "<=".

After that, it might even work. Not that I can convince me that the test
for "offslab_limit" ever even triggers, so..

Linus

2006-03-06 23:01:22

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/6/06, Linus Torvalds <[email protected]> wrote:
>
>
> On Mon, 6 Mar 2006, Linus Torvalds wrote:
> >
> > Either revert it, or try this (TOTALLY UNTESTED!!!) patch..
>
> Don't even bother with the untested patch.
>
> > + for (gfporder = 0 ; gfporder < MAX_GFP_ORDER; gfporder++) {
>
> At a minimum, this "<" needs to be "<=".
>
> After that, it might even work. Not that I can convince me that the test
> for "offslab_limit" ever even triggers, so..
>

Ehh, it's getting pretty clear that you are looking at
2.6.16-rc5-git<latest> and I'm using -mm here, since that code is not
present in mm/slab.c in 2.6.16-rc5-mm2 in anything near that form.

And since 2.6.16-rc5-git8 is not experiencing problems I'd suggest you
perhaps instead take a look at what's in -mm... That's where we need
to work (it seems) to find the bug...

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-06 23:08:12

by Andrew Morton

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

"Jesper Juhl" <[email protected]> wrote:
>
> And since 2.6.16-rc5-git8 is not experiencing problems I'd suggest you
> perhaps instead take a look at what's in -mm... That's where we need
> to work (it seems) to find the bug...

Yes, it's very probably something in git-scsi-misc.

drivers/block/cciss.c | 3
drivers/message/fusion/Kconfig | 1
drivers/message/fusion/mptbase.c | 72 -
drivers/message/fusion/mptbase.h | 14
drivers/message/fusion/mptfc.c | 4
drivers/message/fusion/mptlan.c | 5
drivers/message/fusion/mptsas.c | 196 ++
drivers/message/fusion/mptscsih.c | 2402 +-----------------------------------
drivers/message/fusion/mptscsih.h | 11
drivers/message/fusion/mptspi.c | 733 ++++++++++
drivers/scsi/53c700.c | 18
drivers/scsi/aacraid/aacraid.h | 5
drivers/scsi/aacraid/comminit.c | 1
drivers/scsi/aacraid/commsup.c | 14
drivers/scsi/aacraid/linit.c | 14
drivers/scsi/aha152x.c | 7
drivers/scsi/aic7xxx/aic79xx_core.c | 24
drivers/scsi/aic7xxx/aic7xxx_core.c | 24
drivers/scsi/aic7xxx/aic7xxx_osm.c | 45
drivers/scsi/aic7xxx/aic7xxx_osm.h | 5
drivers/scsi/hosts.c | 3
drivers/scsi/ipr.c | 109 +
drivers/scsi/ips.c | 2
drivers/scsi/jazz_esp.c | 19
drivers/scsi/lpfc/lpfc.h | 40
drivers/scsi/lpfc/lpfc_attr.c | 162 +-
drivers/scsi/lpfc/lpfc_crtn.h | 28
drivers/scsi/lpfc/lpfc_ct.c | 74 -
drivers/scsi/lpfc/lpfc_disc.h | 19
drivers/scsi/lpfc/lpfc_els.c | 772 +++++++----
drivers/scsi/lpfc/lpfc_hbadisc.c | 492 +++----
drivers/scsi/lpfc/lpfc_hw.h | 65
drivers/scsi/lpfc/lpfc_init.c | 265 ++-
drivers/scsi/lpfc/lpfc_mbox.c | 33
drivers/scsi/lpfc/lpfc_nportdisc.c | 374 +++--
drivers/scsi/lpfc/lpfc_scsi.c | 25
drivers/scsi/lpfc/lpfc_scsi.h | 5
drivers/scsi/lpfc/lpfc_sli.c | 385 +++--
drivers/scsi/lpfc/lpfc_sli.h | 5
drivers/scsi/lpfc/lpfc_version.h | 6
drivers/scsi/ncr53c8xx.c | 127 -
drivers/scsi/ncr53c8xx.h | 37
drivers/scsi/osst.c | 24
drivers/scsi/qla2xxx/qla_def.h | 6
drivers/scsi/qla2xxx/qla_gbl.h | 2
drivers/scsi/qla2xxx/qla_isr.c | 7
drivers/scsi/qla2xxx/qla_mbx.c | 4
drivers/scsi/qla2xxx/qla_os.c | 89 -
drivers/scsi/qla2xxx/qla_sup.c | 4
drivers/scsi/scsi.c | 6
drivers/scsi/scsi_debug.c | 9
drivers/scsi/scsi_ioctl.c | 3
drivers/scsi/scsi_lib.c | 76 -
drivers/scsi/scsi_scan.c | 100 -
drivers/scsi/scsi_sysfs.c | 4
drivers/scsi/scsi_transport_fc.c | 9
drivers/scsi/scsi_transport_iscsi.c | 3
drivers/scsi/scsi_transport_sas.c | 258 +++
drivers/scsi/scsi_transport_spi.c | 83 -
drivers/scsi/sd.c | 11
drivers/scsi/sg.c | 16
drivers/scsi/sr.c | 5
drivers/scsi/sr_ioctl.c | 6
drivers/scsi/st.c | 29
drivers/scsi/sym53c8xx_2/sym_hipd.c | 53
include/linux/workqueue.h | 6
include/scsi/scsi.h | 2
include/scsi/scsi_cmnd.h | 20
include/scsi/scsi_device.h | 16
include/scsi/scsi_transport_sas.h | 22
include/scsi/scsi_transport_spi.h | 4
kernel/workqueue.c | 29
72 files changed, 3444 insertions(+), 4107 deletions(-)

2006-03-06 23:24:46

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/7/06, Andrew Morton <[email protected]> wrote:
> "Jesper Juhl" <[email protected]> wrote:
> >
> > And since 2.6.16-rc5-git8 is not experiencing problems I'd suggest you
> > perhaps instead take a look at what's in -mm... That's where we need
> > to work (it seems) to find the bug...
>
> Yes, it's very probably something in git-scsi-misc.
>
I would say that's correct. I just build 2.6.16-rc5-mm2 with just
git-scsi-misc.patch reverted, and that makes the problem go away.

So now the big question is; what part(s) of git-scsi-misc is broken?

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-07 00:18:25

by Linus Torvalds

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2



On Tue, 7 Mar 2006, Jesper Juhl wrote:
>
> On 3/7/06, Andrew Morton <[email protected]> wrote:
> > "Jesper Juhl" <[email protected]> wrote:
> > >
> > > And since 2.6.16-rc5-git8 is not experiencing problems I'd suggest you
> > > perhaps instead take a look at what's in -mm... That's where we need
> > > to work (it seems) to find the bug...
> >
> > Yes, it's very probably something in git-scsi-misc.
> >
> I would say that's correct. I just build 2.6.16-rc5-mm2 with just
> git-scsi-misc.patch reverted, and that makes the problem go away.

Ok. I was kind of hoping that it was just a more reliable case of the
corruption that Andrew had been seeing too (which seems to be hard to
trigger in mainline too, but might exist there).

> So now the big question is; what part(s) of git-scsi-misc is broken?

Well, its origin is actually a git tree, so you could try the "git bisect"
approach using the

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git

tree that the patch comes from..

Linus

2006-03-07 00:25:38

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/7/06, Linus Torvalds <[email protected]> wrote:
>
>
> On Tue, 7 Mar 2006, Jesper Juhl wrote:
> >
> > On 3/7/06, Andrew Morton <[email protected]> wrote:
> > > "Jesper Juhl" <[email protected]> wrote:
> > > >
> > > > And since 2.6.16-rc5-git8 is not experiencing problems I'd suggest you
> > > > perhaps instead take a look at what's in -mm... That's where we need
> > > > to work (it seems) to find the bug...
> > >
> > > Yes, it's very probably something in git-scsi-misc.
> > >
> > I would say that's correct. I just build 2.6.16-rc5-mm2 with just
> > git-scsi-misc.patch reverted, and that makes the problem go away.
>
> Ok. I was kind of hoping that it was just a more reliable case of the
> corruption that Andrew had been seeing too (which seems to be hard to
> trigger in mainline too, but might exist there).
>
> > So now the big question is; what part(s) of git-scsi-misc is broken?
>
> Well, its origin is actually a git tree, so you could try the "git bisect"
> approach using the
>
> git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git
>
> tree that the patch comes from..
>

I'll give that a go tomorrow - right now I need to get some sleep.
If there are other things to try, then just drop me a mail and I'll
test it tomorrow.

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-07 03:15:27

by Mike Christie

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

Jesper Juhl wrote:
> On 3/7/06, Andrew Morton <[email protected]> wrote:
>
>>"Jesper Juhl" <[email protected]> wrote:
>>
>>>And since 2.6.16-rc5-git8 is not experiencing problems I'd suggest you
>>> perhaps instead take a look at what's in -mm... That's where we need
>>> to work (it seems) to find the bug...
>>
>>Yes, it's very probably something in git-scsi-misc.
>>
>
> I would say that's correct. I just build 2.6.16-rc5-mm2 with just
> git-scsi-misc.patch reverted, and that makes the problem go away.
>
> So now the big question is; what part(s) of git-scsi-misc is broken?
>

Is it relate to this change?

diff --git a/drivers/scsi/sr_ioctl.c b/drivers/scsi/sr_ioctl.c
index 5d02ff4..03fbc4b 100644
--- a/drivers/scsi/sr_ioctl.c
+++ b/drivers/scsi/sr_ioctl.c
@@ -44,11 +44,10 @@ static int sr_read_tochdr(struct cdrom_d
int result;
unsigned char *buffer;

- buffer = kmalloc(32, GFP_KERNEL | SR_GFP_DMA(cd));
+ buffer = kzalloc(32, GFP_KERNEL | SR_GFP_DMA(cd));
if (!buffer)
return -ENOMEM;

- memset(&cgc, 0, sizeof(struct packet_command)); cgc.timeout = IOCTL_TIMEOUT;
cgc.cmd[0] = GPCMD_READ_TOC_PMA_ATIP; cgc.cmd[8] = 12; /* LSB of length */
@@ -74,11 +73,10 @@ static int sr_read_tocentry(struct cdrom
int result;
unsigned char *buffer;

- buffer = kmalloc(32, GFP_KERNEL | SR_GFP_DMA(cd));
+ buffer = kzalloc(32, GFP_KERNEL | SR_GFP_DMA(cd));
if (!buffer)
return -ENOMEM;

- memset(&cgc, 0, sizeof(struct packet_command));




When someone converted the *buffer* allocation to kzalloc they
also removed the the memset for the *packet_cmmand* struct.

The

memset(&cgc, 0, sizeof(struct packet_command));

should be added back I think.

2006-03-07 03:20:48

by Linus Torvalds

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2



On Mon, 6 Mar 2006, Mike Christie wrote:
> - buffer = kmalloc(32, GFP_KERNEL | SR_GFP_DMA(cd));
> + buffer = kzalloc(32, GFP_KERNEL | SR_GFP_DMA(cd));
> if (!buffer)
> return -ENOMEM;
>
> - memset(&cgc, 0, sizeof(struct packet_command));
...
> - buffer = kmalloc(32, GFP_KERNEL | SR_GFP_DMA(cd));
> + buffer = kzalloc(32, GFP_KERNEL | SR_GFP_DMA(cd));
> if (!buffer)
> return -ENOMEM;
>
> - memset(&cgc, 0, sizeof(struct packet_command));

> When someone converted the *buffer* allocation to kzalloc they
> also removed the the memset for the *packet_cmmand* struct.
>
> The
>
> memset(&cgc, 0, sizeof(struct packet_command));
>
> should be added back I think.

Good eyes. I bet that's it.

Linus

2006-03-07 08:48:04

by Pekka Enberg

[permalink] [raw]
Subject: [PATCH] slab: fix offslab_limit in calculate_slab_order (Was: Slab corruption in 2.6.16-rc5-mm2)

On Mon, 6 Mar 2006, Linus Torvalds wrote:
> In particular, as far as I can tell, that
>
> + /* More than offslab_limit objects will cause problems */
> + if (flags & CFLGS_OFF_SLAB && cachep->num > offslab_limit)
> + break;
>
> has been incorrectly translated for several reasons:
>
> - we shouldn't check "cachep->num > offslab_limit". We should check just
> "num > offslab_limit" (cachep->num is the _previous_ number we tested).
>
> - when we do "break", we've already incremented "gfporder", and we should
> correct for that.
>
> Now, maybe I'm just off my rocker again (I've certainly been batting 0.000
> so far, even if I think I've been finding real bugs). So who knows. But I
> get the feeling that that patch is broken.
>
> Either revert it, or try this (TOTALLY UNTESTED!!!) patch..
>
> And hey, maybe I'm just crazy.

No you're not, it's broken. However, I think you're forgetting to reset
cachep->num when we go over MAX_GFP_ORDER, no? This patch boots although
I don't hit offslab limit.

Pekka

[PATCH] slab: fix offslab_limit in calculate_slab_order

From: Linus Torvalds <[email protected]>

The commit "[PATCH] slab: extract slab order calculation to separate
function" broke offslab_limit handling:

- We should check for num instead of cachep->num because the latter
is the number of objects for the _previous_ gfp order.

- After hitting the offslab_limit, we need to correct gfporder and
calculate left_over and cachep->num for that before exiting. We
do this by keeping current state in local variables and previous
state in cachep.

Linus spotted the problem and wrote the patch. I fixed gfporder
resetting when we go over MAX_GFP_ORDER and tested it with UM.

Cc: Andrew Morton <[email protected]>
Signed-off-by: Pekka Enberg <[email protected]>
---

diff --git a/mm/slab.c b/mm/slab.c
index add05d8..fe63479 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1628,25 +1628,22 @@ static inline size_t calculate_slab_orde
size_t size, size_t align, unsigned long flags)
{
size_t left_over = 0;
+ int gfporder;

- for (;; cachep->gfporder++) {
+ for (gfporder = 0 ; gfporder <= MAX_GFP_ORDER; gfporder++) {
unsigned int num;
size_t remainder;

- if (cachep->gfporder > MAX_GFP_ORDER) {
- cachep->num = 0;
- break;
- }
-
- cache_estimate(cachep->gfporder, size, align, flags,
- &remainder, &num);
+ cache_estimate(gfporder, size, align, flags, &remainder, &num);
if (!num)
continue;
+
/* More than offslab_limit objects will cause problems */
- if (flags & CFLGS_OFF_SLAB && cachep->num > offslab_limit)
+ if ((flags & CFLGS_OFF_SLAB) && num > offslab_limit)
break;

cachep->num = num;
+ cachep->gfporder = gfporder;
left_over = remainder;

/*
@@ -1660,6 +1657,9 @@ static inline size_t calculate_slab_orde
/* Acceptable internal fragmentation */
break;
}
+ if (cachep->gfporder > MAX_GFP_ORDER)
+ cachep->num = 0;
+
return left_over;
}

2006-03-07 17:13:29

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] slab: fix offslab_limit in calculate_slab_order (Was: Slab corruption in 2.6.16-rc5-mm2)



On Tue, 7 Mar 2006, Pekka J Enberg wrote:
>
> No you're not, it's broken. However, I think you're forgetting to reset
> cachep->num when we go over MAX_GFP_ORDER, no?

No, we only ever set "cachep->num" for something that we've decided is
valid.

"gfporder" can never be > MAX_GFP_ORDER inside the loop, because we just
iterate between 0..MAX_GFP_ORDER.

Linus

2006-03-07 18:02:16

by James Bottomley

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On Mon, 2006-03-06 at 19:20 -0800, Linus Torvalds wrote:
> > should be added back I think.
>
> Good eyes. I bet that's it.

Yes, well done. Do we have confirmation yet that reversing this fixes
the bug?

I think a full reversal is in order, since buffer is a quantity being
written to, there's no point in zeroing it.

[Note to self: must do better in checking janitors patches]

James


2006-03-07 19:21:40

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PATCH] slab: fix offslab_limit in calculate_slab_order (Was: Slab corruption in 2.6.16-rc5-mm2)

On Tue, 7 Mar 2006, Pekka J Enberg wrote:
> > No you're not, it's broken. However, I think you're forgetting to reset
> > cachep->num when we go over MAX_GFP_ORDER, no?

On Tue, 2006-03-07 at 09:12 -0800, Linus Torvalds wrote:
> No, we only ever set "cachep->num" for something that we've decided is
> valid.
>
> "gfporder" can never be > MAX_GFP_ORDER inside the loop, because we just
> iterate between 0..MAX_GFP_ORDER.

I don't think that's true. We set cachep->num to something we think is
valid but check for internal fragmentation later. So I think we can get
out of the loop with cachep->num initialized to non-zero but gfporder
set to MAX_GFP_ORDER, no? Or did I forget to take my medicine this
morning...?

Pekka

2006-03-07 19:40:40

by Jesper Juhl

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On 3/7/06, James Bottomley <[email protected]> wrote:
> On Mon, 2006-03-06 at 19:20 -0800, Linus Torvalds wrote:
> > > should be added back I think.
> >
> > Good eyes. I bet that's it.
>
> Yes, well done. Do we have confirmation yet that reversing this fixes
> the bug?
>

I just tried reverting that bit only from 2.6.16-rc5-mm2, and it does
indeed fix the problem.

Thanks for spotting that Mike.

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-08 06:30:55

by Chuck Ebbert

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

In-Reply-To: <[email protected]>

On Mon, 6 Mar 2006 19:20:13 -0800, Linus Torvalds wrote:

> > When someone converted the *buffer* allocation to kzalloc they
> > also removed the the memset for the *packet_cmmand* struct.
> >
> > The
> >
> > memset(&cgc, 0, sizeof(struct packet_command));
> >
> > should be added back I think.
>
> Good eyes. I bet that's it.

Heh. This exact fix was posted to linux-kernel by Lee Schermerhorn
three weeks ago:

Date: Wed, 15 Feb 2006 14:07:37 -0500
From: Lee Schermerhorn <[email protected]>
Subject: [PATCH] 2.6.16-rc3-mm1 - restore zeroing of packet_command
struct in sr_ioctl.c
To: linux-kernel <[email protected]>
Cc: Andrew Morton <[email protected]>
Message-ID: <[email protected]>


--
Chuck
"Penguins don't come from next door, they come from the Antarctic!"

2006-03-08 08:32:34

by Nick Piggin

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

Chuck Ebbert wrote:
> In-Reply-To: <[email protected]>
>
> On Mon, 6 Mar 2006 19:20:13 -0800, Linus Torvalds wrote:
>
>
>>>When someone converted the *buffer* allocation to kzalloc they
>>>also removed the the memset for the *packet_cmmand* struct.
>>>
>>>The
>>>
>>>memset(&cgc, 0, sizeof(struct packet_command));
>>>
>>>should be added back I think.
>>
>>Good eyes. I bet that's it.
>
>
> Heh. This exact fix was posted to linux-kernel by Lee Schermerhorn
> three weeks ago:
>
> Date: Wed, 15 Feb 2006 14:07:37 -0500
> From: Lee Schermerhorn <[email protected]>
> Subject: [PATCH] 2.6.16-rc3-mm1 - restore zeroing of packet_command
> struct in sr_ioctl.c
> To: linux-kernel <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Message-ID: <[email protected]>
>
>

It isn't Andrew's job to make sure a patch gets to the right place
until it is safely in -mm, and even then he's not always going to
know the severity and importance unless he's told.

If it was a patch to "restore" a regression in behaviour, CCs should
at least have gone to the author of the patch that broke it, and the
subsystem maintainers / list / etc as well.

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com

2006-03-08 08:49:32

by Andrew Morton

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

Nick Piggin <[email protected]> wrote:
>
> Chuck Ebbert wrote:
> > In-Reply-To: <[email protected]>
> >
> > On Mon, 6 Mar 2006 19:20:13 -0800, Linus Torvalds wrote:
> >
> >
> >>>When someone converted the *buffer* allocation to kzalloc they
> >>>also removed the the memset for the *packet_cmmand* struct.
> >>>
> >>>The
> >>>
> >>>memset(&cgc, 0, sizeof(struct packet_command));
> >>>
> >>>should be added back I think.
> >>
> >>Good eyes. I bet that's it.
> >
> >
> > Heh. This exact fix was posted to linux-kernel by Lee Schermerhorn
> > three weeks ago:
> >
> > Date: Wed, 15 Feb 2006 14:07:37 -0500
> > From: Lee Schermerhorn <[email protected]>
> > Subject: [PATCH] 2.6.16-rc3-mm1 - restore zeroing of packet_command
> > struct in sr_ioctl.c
> > To: linux-kernel <[email protected]>
> > Cc: Andrew Morton <[email protected]>
> > Message-ID: <[email protected]>
> >
> >
>
> It isn't Andrew's job to make sure a patch gets to the right place
> until it is safely in -mm, and even then he's not always going to
> know the severity and importance unless he's told.

Is too!

> If it was a patch to "restore" a regression in behaviour, CCs should
> at least have gone to the author of the patch that broke it, and the
> subsystem maintainers / list / etc as well.

I actually merged Lee's patch into -mm, copied James on it and then I
dropped it when I saw that it spat rejects against an updated version of
James's tree, assuming that it had been merged.

Often I'll check that a patch reverts successfully from the upstream tree
before dropping it, but for an obvious one like that I guess I didn't
bother, and assumed that James had taken it. Only he hadn't - instead he'd
gone and merged something else, hence the rejects. Oh well.

2006-03-08 09:03:27

by Nick Piggin

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

Andrew Morton wrote:
> Nick Piggin <[email protected]> wrote:
>

>>It isn't Andrew's job to make sure a patch gets to the right place
>>until it is safely in -mm, and even then he's not always going to
>>know the severity and importance unless he's told.
>
>
> Is too!
>

OK, partially. As this case illustrates, everybody makes mistakes and
you obviously can't go back and verify you got all the patches because.

The guy who hits the bug and/or writes the patch can easily see it is
still not merged and shout.

>
>>If it was a patch to "restore" a regression in behaviour, CCs should
>>at least have gone to the author of the patch that broke it, and the
>>subsystem maintainers / list / etc as well.
>
>
> I actually merged Lee's patch into -mm, copied James on it and then I
> dropped it when I saw that it spat rejects against an updated version of
> James's tree, assuming that it had been merged.
>
> Often I'll check that a patch reverts successfully from the upstream tree
> before dropping it, but for an obvious one like that I guess I didn't
> bother, and assumed that James had taken it. Only he hadn't - instead he'd
> gone and merged something else, hence the rejects. Oh well.
>

You do a great job, but "push the work out to the end nodes", right?
That's how we get this network to scale. It is trivial for people to
verify their important patches have propogated as the release approaches.

(A little harder for part-timers who aren't in the loop about exactly
when the release will happen, thanks to our -ridiculous-count release
system, but still easy compared with your having to double check
everything).

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com

2006-03-08 09:15:27

by Andrew Morton

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

Nick Piggin <[email protected]> wrote:
>
> > Often I'll check that a patch reverts successfully from the upstream tree
> > before dropping it, but for an obvious one like that I guess I didn't
> > bother, and assumed that James had taken it. Only he hadn't - instead he'd
> > gone and merged something else, hence the rejects. Oh well.
> >
>
> You do a great job, but "push the work out to the end nodes", right?
> That's how we get this network to scale. It is trivial for people to
> verify their important patches have propogated as the release approaches.
>
> (A little harder for part-timers who aren't in the loop about exactly
> when the release will happen, thanks to our -ridiculous-count release
> system, but still easy compared with your having to double check
> everything).

Well yes, Lee sent the fix to the guy who he got the kernel release from in
the reasonable expectation that I'd take care of getting it to where it
needed to be.

Problem is, a) I screwed up, b) James screwed up and c) someone just
happened to change those few lines of code in that place within a few-day
window.

That triple-combo doesn't happen very often.

2006-03-08 09:32:24

by Nick Piggin

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

Andrew Morton wrote:

> Well yes, Lee sent the fix to the guy who he got the kernel release from in
> the reasonable expectation that I'd take care of getting it to where it
> needed to be.
>
> Problem is, a) I screwed up, b) James screwed up and c) someone just
> happened to change those few lines of code in that place within a few-day
> window.
>
> That triple-combo doesn't happen very often.
>

I guess what I'm advocating isn't foolproof either: the guy who wrote
the patch might die (knock on wood) ;)

Carry on.

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com

2006-03-08 14:36:15

by Lee Schermerhorn

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On Wed, 2006-03-08 at 20:23 +1100, Nick Piggin wrote:
> Andrew Morton wrote:
>
> > Well yes, Lee sent the fix to the guy who he got the kernel release from in
> > the reasonable expectation that I'd take care of getting it to where it
> > needed to be.
> >
> > Problem is, a) I screwed up, b) James screwed up and c) someone just
> > happened to change those few lines of code in that place within a few-day
> > window.
> >
> > That triple-combo doesn't happen very often.
> >
>
> I guess what I'm advocating isn't foolproof either: the guy who wrote
> the patch might die (knock on wood) ;)

Thanks, Nick. I'll remember that... See you at OLS?

Lee

P.S. ;-)

2006-03-08 16:09:33

by Bill Davidsen

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

Linus Torvalds wrote:

> has been incorrectly translated for several reasons:
>
> - we shouldn't check "cachep->num > offslab_limit". We should check just
> "num > offslab_limit" (cachep->num is the _previous_ number we tested).
>
> - when we do "break", we've already incremented "gfporder", and we should
> correct for that.
>
> Now, maybe I'm just off my rocker again (I've certainly been batting 0.000
> so far, even if I think I've been finding real bugs). So who knows. But I
> get the feeling that that patch is broken.

I thought stumbling over bugs while looking for other things was part of
the new development model ;-)
>
> Either revert it, or try this (TOTALLY UNTESTED!!!) patch..
>
> And hey, maybe I'm just crazy.

Being crazy and being right are not mutually exclusize.
--
-bill davidsen ([email protected])
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me

2006-03-09 15:50:23

by Martin Bligh

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

Dave Jones wrote:
> On Mon, Mar 06, 2006 at 10:41:07PM +0100, Jesper Juhl wrote:
>
> > CONFIG_DEBUG_SLAB
> > CONFIG_PAGE_OWNER
> > CONFIG_DEBUG_VM
> > CONFIG_DEBUG_PAGEALLOC
> >
> > The resulting kernel boots and runs just fine (no Oops) and leaves
> > nothing in dmesg.
> > So, without the debugging options it appears to the user that
> > everything is OK - nasty.
>
> DEBUG_PAGEALLOC in particular is *fantastic* at making bugs hide.
> I've lost many an hour trying to pin bugs down due to that.

Is this backwards? We're saying DEBUG_PAGEALLOC is bad?

OK, what I'm going to try to do, given the recent comments re
CONFIG_DEBUG_SLAB and also PAGEALLOC is to arrange with Andy to run
a debug kernel as well as a normal kernel for every test, and then
we can publish the results on http://test.kernel.org. For now it'll be
a seperate matrix, until I work out how to fold the 3d cube nicely
into 2d - I know I have to do that anyway, so no big deal.

Do we NOT want to have DEBUG_SLAB and DEBUG_PAGEALLOC both enabled?
Running multiple permutations is going to get really painful on the
systems involved. Any other requests for what gets enabled (I really
want to just stick to one 'debug' setup if possible).

I have no idea why I didn't do this a year ago <slaps self>.

M.

2006-03-09 15:54:23

by Martin Bligh

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

>> DEBUG_PAGEALLOC in particular is *fantastic* at making bugs hide.
>> I've lost many an hour trying to pin bugs down due to that.
>
>
> Is this backwards? We're saying DEBUG_PAGEALLOC is bad?
>
> OK, what I'm going to try to do, given the recent comments re
> CONFIG_DEBUG_SLAB and also PAGEALLOC is to arrange with Andy to run
> a debug kernel as well as a normal kernel for every test, and then
> we can publish the results on http://test.kernel.org. For now it'll be
> a seperate matrix, until I work out how to fold the 3d cube nicely
> into 2d - I know I have to do that anyway, so no big deal.
>
> Do we NOT want to have DEBUG_SLAB and DEBUG_PAGEALLOC both enabled?
> Running multiple permutations is going to get really painful on the
> systems involved. Any other requests for what gets enabled (I really
> want to just stick to one 'debug' setup if possible).
>
> I have no idea why I didn't do this a year ago <slaps self>.

Oh, and I ported CONFIG_DEBUG_PAGEALLOC to x86_64 last week. Will send
out the patch later today.

M.

2006-03-09 16:00:10

by Benjamin LaHaise

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On Thu, Mar 09, 2006 at 07:50:15AM -0800, Martin J. Bligh wrote:
> Do we NOT want to have DEBUG_SLAB and DEBUG_PAGEALLOC both enabled?
> Running multiple permutations is going to get really painful on the
> systems involved. Any other requests for what gets enabled (I really
> want to just stick to one 'debug' setup if possible).

Debug kernels are incredibly slow, making hitting certain races next to
impossible. By all means non-DEBUG kernels should definately be getting
tested.

-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[email protected]>.

2006-03-09 16:05:04

by Martin Bligh

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

Benjamin LaHaise wrote:
> On Thu, Mar 09, 2006 at 07:50:15AM -0800, Martin J. Bligh wrote:
>
>>Do we NOT want to have DEBUG_SLAB and DEBUG_PAGEALLOC both enabled?
>>Running multiple permutations is going to get really painful on the
>>systems involved. Any other requests for what gets enabled (I really
>>want to just stick to one 'debug' setup if possible).
>
>
> Debug kernels are incredibly slow, making hitting certain races next to
> impossible. By all means non-DEBUG kernels should definately be getting
> tested.

It'd still run a totally non-debug kernel as well - I want that for the
perf tests etc anyway. I guess the question is whether the debug kernels
should have most of the DEBUG_* turned on, or just a select few.

M.

2006-03-09 16:09:28

by Linus Torvalds

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2



On Thu, 9 Mar 2006, Martin J. Bligh wrote:
> >
> > DEBUG_PAGEALLOC in particular is *fantastic* at making bugs hide.
> > I've lost many an hour trying to pin bugs down due to that.
>
> Is this backwards? We're saying DEBUG_PAGEALLOC is bad?

DEBUG_PAGEALLOC is great for finding the really stupid kinds of bugs, and
it's definitely worth doing every once in a while.

However, DEBUG_PAGEALLOC makes many things orders of magnitude slower, and
it eats memory like mad (because it turns some slabs into whole pages -
but it still doesn't help small allocation debugging that much). So unlike
DEBUG_SLAB, it's not reasonable to have it on all the time.

IOW, DEBUG_SLAB is something that a distro kernel can reasonably enable
for users by default (I think fedora-devel does, for example). In
contrast, DEBUG_PAGEALLOC is more of a "useful for special cases" thing,
where you want to validate that there's nothing _obviously_ bad going on.

> Do we NOT want to have DEBUG_SLAB and DEBUG_PAGEALLOC both enabled?

I suspect that once DEBUG_PAGEALLOC is on, whether you do DEBUG_SLAB or
not is a toss-up. The interesting cases tend to be

- neither: usable for benchmarking
- DEBUG_SLAB: perfectly usable for normal work
- DEBUG_PAGEALLOC (with or without DEBUG_SLAB): debugging tool only

At least that's my opinion, maybe others have other experiences.

Linus

2006-03-09 16:42:19

by Dave Jones

[permalink] [raw]
Subject: Re: Slab corruption in 2.6.16-rc5-mm2

On Thu, Mar 09, 2006 at 08:08:52AM -0800, Linus Torvalds wrote:

> IOW, DEBUG_SLAB is something that a distro kernel can reasonably enable
> for users by default (I think fedora-devel does, for example).

Correct.

> - neither: usable for benchmarking
> - DEBUG_SLAB: perfectly usable for normal work
> - DEBUG_PAGEALLOC (with or without DEBUG_SLAB): debugging tool only
>
> At least that's my opinion, maybe others have other experiences.

That's pretty much my experience. I get people screaming at me when
I enable PAGEALLOC for a day or so in the Fedora-devel kernel if I'm
chasing something :)

Dave

--
http://www.codemonkey.org.uk