2003-11-25 14:00:36

by Prakash K. Cheemplavam

[permalink] [raw]
Subject: Silicon Image 3112A SATA trouble

I think it is high time to get siimage.c 1.06 bug-free. It seems to
have some problems with my HD + SATA converter, mainly performance-wise.
I think it is due to this error appearing constantly in dmesg unless I
comment it out in the sources:

hde: sata_error = 0x00000000, watchdog = 0, siimage_mmio_ide_dma_test_irq

What is the problem? I think it may be because I have a SiI3112A
controller onboard and the driver detects it without the A revision
(just as SiI3112), and/or because I connected a PATA drive to the
controller through a SATA converter.

Now, with 2.6-test10, performance got a bit better compared to test9
and earlier 2.6 kernels: before it was at most 22MB/s, now it is
25MB/s. But it is still far from the 2.4.22-ac4 kernel, which managed
37MB/s, which in turn is still bad compared to Windows, which reaches 50MB/s.

It is NOT a read-ahead problem. I tried various hdparm parameters and
they didn't improve the situation. What makes it even worse:

When I run hdparm -d1 /dev/hde (even though hdparm states DMA is already
on), the drive stops working and I get several lines of errors such as a
drive seek error and some IRQ-related messages. So I have to push the
reset button.

Someone else using a native SATA Maxtor on a SiI3112 (I don't know
whether the A revision or not) has no problems; hdparm -d1 works as
well, and he gets 40MB/s with test10.

So what might be the problem, and how do I get rid of it? (1. the error
message, 2. bad performance, 3. hdparm -d1 malfunctioning.) Problems 1
and 3 were also present with 2.4.22-ac4, and 2 wasn't as bad there, as
stated above, so apart from 2 there is no regression, but also no fix
yet. Changing max_kb_per_request didn't help either.

If you need more info, just ask me.


Here is the relevant part of dmesg:

SiI3112 Serial ATA: IDE controller at PCI slot 0000:01:0b.0
SiI3112 Serial ATA: chipset revision 2
SiI3112 Serial ATA: 100% native mode on irq 11
ide2: MMIO-DMA at 0xf9844000-0xf9844007, BIOS settings: hde:pio,
hdf:pio
ide3: MMIO-DMA at 0xf9844008-0xf984400f, BIOS settings: hdg:pio,
hdh:pio
hde: SAMSUNG SP1614N, ATA DISK drive
ide2 at 0xf9844080-0xf9844087,0xf984408a on irq 11
hde: max request size: 7KiB
hde: 312581808 sectors (160041 MB) w/8192KiB Cache, CHS=19457/255/63,
UDMA(100)
/dev/ide/host2/bus0/target0/lun0:<4>hde: sata_error = 0x00000000,
watchdog = 0, siimage_mmio_ide_dma_test_irq
p1 p2 p3 <<4>hde: sata_error = 0x00000000, watchdog = 0,
siimage_mmio_ide_dma_test_irq
p5<4>hde: sata_error = 0x00000000, watchdog = 0,
siimage_mmio_ide_dma_test_irq
p6<4>hde: sata_error = 0x00000000, watchdog = 0,
siimage_mmio_ide_dma_test_irq
p7<4>hde: sata_error = 0x00000000, watchdog = 0,
siimage_mmio_ide_dma_test_irq
p8<4>hde: sata_error = 0x00000000, watchdog = 0,
siimage_mmio_ide_dma_test_irq
p9 >



Here is hdparm -iI /dev/hde:



/dev/hde:

Model=SAMSUNG SP1614N, FwRev=TM100-24, SerialNo=0735J1FW702444
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
RawCHS=16383/16/63, TrkSize=34902, SectSize=554, ECCbytes=4
BuffType=DualPortCache, BuffSize=8192kB, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=268435455
IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2
AdvancedPM=no WriteCache=enabled
Drive conforms to: (null):

* signifies the current active mode


ATA device, with non-removable media
Model Number: SAMSUNG SP1614N
Serial Number: 0735J1FW702444
Firmware Revision: TM100-24
Standards:
Supported: 7 6 5 4
Likely used: 7
Configuration:
Logical max current
cylinders 16383 65535
heads 16 1
sectors/track 63 63
--
CHS current addressable sectors: 4128705
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 312581808
device size with M = 1024*1024: 152627 MBytes
device size with M = 1000*1000: 160041 MBytes (160 GB)
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 1
Standby timer values: spec'd by Standard, no device specific
minimum
R/W multiple sector transfer: Max = 16 Current = 16
Recommended acoustic management value: 254, current value: 0
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=240ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* READ BUFFER cmd
* WRITE BUFFER cmd
* Host Protected Area feature set
* Look-ahead
* Write cache
* Power Management feature set
Security Mode feature set
* SMART feature set
* FLUSH CACHE EXT command
* Mandatory FLUSH CACHE command
* Device Configuration Overlay feature set
* 48-bit Address feature set
Automatic Acoustic Management feature set
SET MAX security extension
* DOWNLOAD MICROCODE cmd
* SMART self-test
* SMART error logging
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
supported: enhanced erase
96min for SECURITY ERASE UNIT. 96min for ENHANCED SECURITY
ERASE UNIT.
HW reset results:
CBLID- below Vih
Device num = 0 determined by the jumper
Checksum: correct



And here is the complete dmesg:

Linux version 2.6.0-gentoo (root@tachyon) (gcc-Version 3.3.2 20031022
(Gentoo Linux 3.3.2-r2, propolice)) #6 Tue Nov 25 14:40:13 CET 2003
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
On node 0 totalpages: 262128
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 225280 pages, LIFO batch:16
HighMem zone: 32752 pages, LIFO batch:7
DMI 2.2 present.
ACPI: RSDP (v000 Nvidia ) @ 0x000f6b60
ACPI: RSDT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x3fff3000
ACPI: FADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x3fff3040
ACPI: MADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x3fff79c0
ACPI: DSDT (v001 NVIDIA AWRDACPI 0x00001000 MSFT 0x0100000d) @ 0x00000000
Building zonelist for node : 0
Kernel command line: root=/dev/hde6 hdg=none vga=0x51A video=vesa:mtrr,ywrap
ide_setup: hdg=none
Initializing CPU#0
PID hash table entries: 4096 (order 12: 32768 bytes)
Detected 1904.513 MHz processor.
Console: colour dummy device 80x25
Memory: 1032280k/1048512k available (3128k kernel code, 15280k reserved,
1007k data, 160k init, 131008k highmem)
Calibrating delay loop... 3768.32 BogoMIPS
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
checking if image is initramfs...it isn't (ungzip failed); looks like an
initrd
Freeing initrd memory: 304k freed
CPU: After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000
CPU: After vendor identify, caps: 0383fbff c1c3fbff 00000000 00000000
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
CPU: After all inits, caps: 0383fbff c1c3fbff 00000000 00000020
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: AMD Athlon(tm) stepping 01
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfb420, last bus=2
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20031002
ACPI: IRQ 9 was Edge Triggered, setting to Level Triggerd
ACPI: Interpreter enabled
ACPI: Using PIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGPB._PRT]
ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNK2] (IRQs 3 4 5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNK3] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNK4] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNK5] (IRQs 3 4 5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LUBA] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LUBB] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LMAC] (IRQs 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LAPU] (IRQs 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LACI] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LMCI] (IRQs 3 4 5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LSMB] (IRQs 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LUB2] (IRQs 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LFIR] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [L3CM] (IRQs 3 4 5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LIDE] (IRQs 3 4 5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [APC1] (IRQs *16)
ACPI: PCI Interrupt Link [APC2] (IRQs 17)
ACPI: PCI Interrupt Link [APC3] (IRQs *18)
ACPI: PCI Interrupt Link [APC4] (IRQs *19)
ACPI: PCI Interrupt Link [APCE] (IRQs 16)
ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22)
ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22)
ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 22)
ACPI: PCI Interrupt Link [APCI] (IRQs 20 21 22)
ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22)
ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22)
ACPI: PCI Interrupt Link [APCS] (IRQs *23)
ACPI: PCI Interrupt Link [APCL] (IRQs 20 21 22)
ACPI: PCI Interrupt Link [APCM] (IRQs 20 21 22)
ACPI: PCI Interrupt Link [AP3C] (IRQs 20 21 22)
ACPI: PCI Interrupt Link [APCZ] (IRQs 20 21 22)
Linux Plug and Play Support v0.97 (c) Adam Belay
SCSI subsystem initialized
drivers/usb/core/usb.c: registered new driver usbfs
drivers/usb/core/usb.c: registered new driver hub
ACPI: PCI Interrupt Link [LSMB] enabled at IRQ 10
ACPI: PCI Interrupt Link [LUBA] enabled at IRQ 11
ACPI: PCI Interrupt Link [LUBB] enabled at IRQ 5
ACPI: PCI Interrupt Link [LUB2] enabled at IRQ 10
ACPI: PCI Interrupt Link [LMAC] enabled at IRQ 10
ACPI: PCI Interrupt Link [LAPU] enabled at IRQ 10
ACPI: PCI Interrupt Link [LACI] enabled at IRQ 5
ACPI: PCI Interrupt Link [LFIR] enabled at IRQ 11
ACPI: PCI Interrupt Link [LNK4] enabled at IRQ 11
ACPI: PCI Interrupt Link [LNK1] enabled at IRQ 5
ACPI: PCI Interrupt Link [LNK3] enabled at IRQ 11
PCI: Using ACPI for IRQ routing
PCI: if you experience problems, try using option 'pci=noacpi' or even
'acpi=off'
vesafb: framebuffer at 0xc0000000, mapped to 0xf8808000, size 16384k
vesafb: mode is 1280x1024x16, linelength=2560, pages=1
vesafb: protected mode interface info at c000:ea60
vesafb: scrolling: redraw
vesafb: directcolor: size=0:5:6:5, shift=0:11:5:0
fb0: VESA VGA frame buffer device
Machine check exception polling timer started.
IA-32 Microcode Update Driver: v1.13 <[email protected]>
apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16ac)
apm: overridden by ACPI.
highmem bounce pool size: 64 pages
devfs: v1.22 (20021013) Richard Gooch ([email protected])
devfs: boot_options: 0x1
Installing knfsd (copyright (C) 1996 [email protected]).
NTFS driver 2.1.5 [Flags: R/W].
udf: registering filesystem
SGI XFS for Linux with large block numbers, no debug enabled
ACPI: Power Button (FF) [PWRF]
ACPI: Fan [FAN] (on)
ACPI: Processor [CPU0] (supports C1)
ACPI: Thermal Zone [THRM] (43 C)
bootsplash 3.1.3-2003/11/14: looking for picture.... silentjpeg size
155838 bytes, found (1280x1024, 155850 bytes, v3).
Console: switching to colour frame buffer device 153x58
pty: 256 Unix98 ptys configured
Real Time Clock Driver v1.12
Non-volatile memory driver v1.2
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
Using anticipatory io scheduler
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
loop: loaded (max 8 devices)
forcedeth.c: Reverse Engineered nForce ethernet driver. Version 0.18.
PCI: Setting latency timer of device 0000:00:04.0 to 64
eth0: forcedeth.c: subsystem: 0147b:1c00
Linux video capture interface: v1.00
DriverInitialize MAC address = ff:ff:ff:ff:ff:ff:00:00
DriverInitialize key =
ff ff ff ff
ff ff ff ff
ff ff ff ff
ff ff ff ff
DVB: registering new adapter (Technisat SkyStar2 driver).
DVB: registering frontend 0:0 (Zarlink MT312)...
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
NFORCE2: IDE controller at PCI slot 0000:00:09.0
NFORCE2: chipset revision 162
NFORCE2: not 100% native mode: will probe irqs later
NFORCE2: 0000:00:09.0 (rev a2) UDMA133 controller
ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:DMA
hda: _NEC DV-5800A, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdc: LITE-ON LTR-16102B, ATAPI CD/DVD-ROM drive
hdd: IOMEGA ZIP 100 ATAPI, ATAPI FLOPPY drive
ide1 at 0x170-0x177,0x376 on irq 15
SiI3112 Serial ATA: IDE controller at PCI slot 0000:01:0b.0
SiI3112 Serial ATA: chipset revision 2
SiI3112 Serial ATA: 100% native mode on irq 11
ide2: MMIO-DMA at 0xf9844000-0xf9844007, BIOS settings: hde:pio,
hdf:pio
ide3: MMIO-DMA at 0xf9844008-0xf984400f, BIOS settings: hdg:pio,
hdh:pio
hde: SAMSUNG SP1614N, ATA DISK drive
ide2 at 0xf9844080-0xf9844087,0xf984408a on irq 11
hde: max request size: 7KiB
hde: 312581808 sectors (160041 MB) w/8192KiB Cache, CHS=19457/255/63,
UDMA(100)
/dev/ide/host2/bus0/target0/lun0:<4>hde: sata_error = 0x00000000,
watchdog = 0, siimage_mmio_ide_dma_test_irq
p1 p2 p3 <<4>hde: sata_error = 0x00000000, watchdog = 0,
siimage_mmio_ide_dma_test_irq
p5<4>hde: sata_error = 0x00000000, watchdog = 0,
siimage_mmio_ide_dma_test_irq
p6<4>hde: sata_error = 0x00000000, watchdog = 0,
siimage_mmio_ide_dma_test_irq
p7<4>hde: sata_error = 0x00000000, watchdog = 0,
siimage_mmio_ide_dma_test_irq
p8<4>hde: sata_error = 0x00000000, watchdog = 0,
siimage_mmio_ide_dma_test_irq
p9 >
hda: ATAPI 48X DVD-ROM drive, 512kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.12
hdc: ATAPI 40X CD-ROM CD-R/RW drive, 2048kB Cache, DMA
ide-floppy driver 0.99.newide
hdd: No disk in drive
hdd: 98304kB, 32/64/96 CHS, 4096 kBps, 512 sector size, 2941 rpm
ohci1394: $Rev: 1045 $ Ben Collins <[email protected]>
PCI: Setting latency timer of device 0000:00:0d.0 to 64
ohci1394_0: OHCI-1394 1.1 (PCI): IRQ=[11] MMIO=[cc084000-cc0847ff] Max
Packet=[2048]
ohci1394_0: SelfID received outside of bus reset sequence
video1394: Installed video1394 module
raw1394: /dev/raw1394 device initialized
Console: switching to colour frame buffer device 153x58
ehci_hcd 0000:00:02.2: EHCI Host Controller
PCI: Setting latency timer of device 0000:00:02.2 to 64
ehci_hcd 0000:00:02.2: irq 10, pci mem f984c000
ehci_hcd 0000:00:02.2: new USB bus registered, assigned bus number 1
PCI: cache line size of 64 is not supported by device 0000:00:02.2
ehci_hcd 0000:00:02.2: USB 2.0 enabled, EHCI 1.00, driver 2003-Jun-13
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 6 ports detected
ohci_hcd: 2003 Oct 13 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ohci_hcd: block sizes: ed 64 td 64
ohci_hcd 0000:00:02.0: OHCI Host Controller
PCI: Setting latency timer of device 0000:00:02.0 to 64
ohci_hcd 0000:00:02.0: irq 11, pci mem f984e000
ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
ohci_hcd 0000:00:02.1: OHCI Host Controller
PCI: Setting latency timer of device 0000:00:02.1 to 64
ohci_hcd 0000:00:02.1: irq 5, pci mem f9850000
ohci_hcd 0000:00:02.1: new USB bus registered, assigned bus number 3
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 3 ports detected
drivers/usb/host/uhci-hcd.c: USB Universal Host Controller Interface
driver v2.1
drivers/usb/core/usb.c: registered new driver usblp
drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver
Initializing USB Mass Storage driver...
drivers/usb/core/usb.c: registered new driver usb-storage
USB Mass Storage support registered.
drivers/usb/core/usb.c: registered new driver usbscanner
drivers/usb/image/scanner.c: 0.4.15:USB Scanner Driver
mice: PS/2 mouse device common for all mice
input: ImPS/2 Generic Wheel Mouse on isa0060/serio1
serio: i8042 AUX port at 0x60,0x64 irq 12
input: AT Translated Set 2 keyboard on isa0060/serio0
serio: i8042 KBD port at 0x60,0x64 irq 1
I2O Core - (C) Copyright 1999 Red Hat Software
I2O: Event thread created as pid 15
i2o: Checking for PCI I2O controllers...
I2O configuration manager v 0.04.
(C) Copyright 1999 Red Hat Software
i2c /dev entries driver
i2c_adapter i2c-0: nForce2 SMBus adapter at 0x5000
i2c_adapter i2c-1: nForce2 SMBus adapter at 0x5100
ieee1394: Host added: ID:BUS[0-00:1023] GUID[000000508df0fbe3]
hub 2-0:1.0: new USB device on port 1, assigned address 2
Advanced Linux Sound Architecture Driver Version 0.9.7 (Thu Sep 25
19:16:36 2003 UTC).
request_module: failed /sbin/modprobe -- snd-card-0. error = -16
PCI: Setting latency timer of device 0000:00:06.0 to 64
drivers/usb/class/usblp.c: usblp0: USB Bidirectional printer dev 2 if 0
alt 1 proto 2 vid 0x03F0 pid 0x1004
intel8x0: clocking to 47482
ALSA device list:
#0: NVidia nForce2 at 0xcc081000, irq 5
NET: Registered protocol family 2
IP: routing cache hash table of 8192 buckets, 64Kbytes
TCP: Hash tables configured (established 262144 bind 65536)
NET: Registered protocol family 1
NET: Registered protocol family 17
ACPI: (supports S0 S3 S4 S5)
hde: sata_error = 0x00000000, watchdog = 0, siimage_mmio_ide_dma_test_irq
hde: sata_error = 0x00000000, watchdog = 0, siimage_mmio_ide_dma_test_irq
hde: sata_error = 0x00000000, watchdog = 0, siimage_mmio_ide_dma_test_irq
hde: sata_error = 0x00000000, watchdog = 0, siimage_mmio_ide_dma_test_irq
UDF-fs DEBUG fs/udf/lowlevel.c:65:udf_get_last_session:
CDROMMULTISESSION not supported: rc=-22
UDF-fs DEBUG fs/udf/super.c:1550:udf_fill_super: Multi-session=0
UDF-fs DEBUG fs/udf/super.c:538:udf_vrs: Starting at sector 16 (2048
byte sectors)
UDF-fs: No VRS found
XFS mounting filesystem hde6
Ending clean XFS mount for filesystem: hde6
VFS: Mounted root (xfs filesystem) readonly.
Mounted devfs on /dev
Freeing unused kernel memory: 160k freed
NTFS volume version 3.1.
NTFS volume version 3.1.

Prakash


2003-11-29 15:39:38

by Prakash K. Cheemplavam

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Holy Shit!

I just tried the libata driver and it ROCKSSSS! So far, at least.

I already wrote about the crappy SiI3112 IDE driver; now with libata I
get >60MB/s!!!! More than I get with Windows.

Also did tests with dd. This rocks. Let's see whether it likes swsusp, as well...

So folks, try libata, as well.

I don't know exactly what is needed. I enabled SCSI, SCSI disk, SCSI
generic, SATA and its driver. In GRUB I appended "doataraid noraid".

YES!

Prakash

2003-11-29 16:38:42

by Julien Oster

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

"Prakash K. Cheemplavam" <[email protected]> writes:

Hello Prakash,

> Holy Shit!

> I just tried the libata driver and it ROCKSSSS! So far, at least.
> I already wrote about the crappy SiI3112 ide driver, now with libata I
> get >60mb/sec!!!! More then I get with windows.
> Also tests with dd. This rocks. Lets see whether it likes swsup, as well...

Sounds GREAT!

> So folks, try libata, as well.
> I dunno what all is actuall needed. I enabled scsi, scie disk, scsi
> generic, sata and its driver. In grub I appended "doataraid noraid".
> YES!

I can't find the Silicon Image driver under

"SCSI low-level drivers" -> "Serial ATA (SATA) support"

under 2.6.0-test11. Only the following are there:

ServerWorks Frodo
Intel PIIX/ICH
Promise SATA
VIA SATA

So, which kernel do I need?

Regards,
Julien

2003-11-29 17:00:28

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sat, Nov 29, 2003 at 04:39:34PM +0100, Prakash K. Cheemplavam wrote:
> Holy Shit!
>
> I just tried the libata driver and it ROCKSSSS! So far, at least.
>
> I already wrote about the crappy SiI3112 ide driver, now with libata I
> get >60mb/sec!!!! More then I get with windows.
>
> Also tests with dd. This rocks. Lets see whether it likes swsup, as well...
>
> So folks, try libata, as well.

Thanks :)

Note that (speaking technically) the SII libata driver doesn't mask all
interrupt conditions, which is why it's listed under CONFIG_BROKEN. So
this translates to "you might get a random lockup", which some users
certainly do see.

For other users, the libata SII driver works flawlessly...

Jeff



2003-11-29 16:59:14

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sat, Nov 29, 2003 at 05:38:37PM +0100, Julien Oster wrote:
> I can't find the Silicon Image driver under
>
> "SCSI low-level drivers" -> "Serial ATA (SATA) support"
>
> under 2.6.0-test11. Just the following are there:
>
> ServerWorks Frodo
> Intel PIIX/ICH
> Promisa SATA
> VIA SATA

You need to enable CONFIG_BROKEN :)

Jeff



2003-11-29 17:07:22

by Craig Bradney

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

> I can't find the Silicon Image driver under
>
> "SCSI low-level drivers" -> "Serial ATA (SATA) support"
>
> under 2.6.0-test11. Just the following are there:
>
> ServerWorks Frodo
> Intel PIIX/ICH
> Promise SATA
> VIA SATA
>

Try under ATA/ATAPI/MFM/RLL support

Silicon Image Chipset Support
CONFIG_BLK_DEV_SIIMAGE: This driver adds PIO/(U)DMA support for the SI CMD680 and SII 3112 (Serial ATA) chips.

Craig



2003-11-29 17:41:16

by Miquel van Smoorenburg

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

In article <[email protected]>,
Jeff Garzik <[email protected]> wrote:
>Note that (speaking technically) the SII libata driver doesn't mask all
>interrupt conditions, which is why it's listed under CONFIG_BROKEN. So
>this translates to "you might get a random lockup", which some users
>certainly do see.

That begs the question: is that going to be fixed?

Also, the low performance of the IDE SII driver is because of
the bug with request size (and the bad workaround). Was that
fixed in the libata version, and if so, is someone working on
porting that fix to the IDE version of the driver?

Mike.

2003-11-29 18:39:39

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sat, Nov 29, 2003 at 05:41:14PM +0000, Miquel van Smoorenburg wrote:
> In article <[email protected]>,
> Jeff Garzik <[email protected]> wrote:
> >Note that (speaking technically) the SII libata driver doesn't mask all
> >interrupt conditions, which is why it's listed under CONFIG_BROKEN. So
> >this translates to "you might get a random lockup", which some users
> >certainly do see.
>
> That begs the question: is that going to be fixed ?

Certainly.


> Also, the low performance of the IDE SII driver is because of
> the bug with request-size (and the bad workaround). Was that
> fixed in the libata version and if so is someone working on
> porting that fix to the IDE version of the driver ?

It is fixed in the libata version.

Jeff



2003-11-29 20:25:59

by Marcus Hartig

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Jeff Garzik wrote:

> Note that (speaking technically) the SII libata driver doesn't mask all
> interrupt conditions, which is why it's listed under CONFIG_BROKEN. So
> this translates to "you might get a random lockup", which some users
> certainly do see.

However, I've also tested it with my new Maxtor SATA drive. And I must
say: many thanks, well done! Now I can use 2.6.0-test under Fedora with
a fine speed of ~50MB/s in disk reads.

And with GNOME2 under 2.6.0-test11 I can compile the kernel, watch a
movie trailer, play two OpenGL screensavers and download a Knoppix ISO,
and the desktop still performs well, as if nothing else were running.
Cool! <http://www.marcush.de/screen-2.6.jpg> (250kb)

Jeff, if you ever come to Hamburg, Germany, I will invite you for a good
drink in a fine location. :-)


Greetings,

Marcus

2003-11-30 01:51:34

by Prakash K. Cheemplavam

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Craig Bradney wrote:
>>I can't find the Silicon Image driver under
>>
>>"SCSI low-level drivers" -> "Serial ATA (SATA) support"
>>
>>under 2.6.0-test11. Just the following are there:
>>
>>ServerWorks Frodo
>>Intel PIIX/ICH
>>Promise SATA
>>VIA SATA
>>
>
>
> Try under ATA/ATAPI/MFM/RLL support
>
> Silicon Image Chipset Support
> CONFIG_BLK_DEV_SIIMAGE: This driver adds PIO/(U)DMA support for the SI CMD680 and SII 3112 (Serial ATA) chips.

No, that is the IDE driver that sucks big time.

Prakash

2003-11-30 02:01:06

by Prakash K. Cheemplavam

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Jeff Garzik wrote:
> On Sat, Nov 29, 2003 at 04:39:34PM +0100, Prakash K. Cheemplavam wrote:
>>I just tried the libata driver and it ROCKSSSS! So far, at least.
>>
>>I already wrote about the crappy SiI3112 ide driver, now with libata I
>>get >60mb/sec!!!! More then I get with windows.

> Thanks :)

Come on, we must thank you. You can't imagine how frustrated I had become
with the SiI bugger. :-)

Prakash

Subject: Re: Silicon Image 3112A SATA trouble


Okay, stop bashing IDE driver... three mails is enough...

Apply this patch and you should get similar performance from the IDE driver.
You are probably seeing big improvements with the libata driver because you
are using Samsung and IBM/Hitachi drives only; for Seagate it probably sucks
just like the IDE driver...

The IDE driver limits requests to 15kB for all SATA drives...
The libata driver limits requests to 15kB only for Seagate SATA drives...

Both drivers still need a proper fix for Seagate drives...

--bart

On Sunday 30 of November 2003 03:00, Prakash K. Cheemplavam wrote:
> Jeff Garzik wrote:
> > On Sat, Nov 29, 2003 at 04:39:34PM +0100, Prakash K. Cheemplavam wrote:
> >>I just tried the libata driver and it ROCKSSSS! So far, at least.
> >>
> >>I already wrote about the crappy SiI3112 ide driver, now with libata I
> >>get >60mb/sec!!!! More then I get with windows.
> >
> > Thanks :)
>
> Come on, we must thank you. You don't imagine how frustrated I became of
> the SiI bugger. :-)
>
> Prakash


[IDE] siimage.c: limit requests to 15kB only for Seagate SATA drives

Fix from jgarzik's sata_sil.c libata driver.

drivers/ide/pci/siimage.c | 23 ++++++++++++++++++++++-
1 files changed, 22 insertions(+), 1 deletion(-)

diff -puN drivers/ide/pci/siimage.c~ide-siimage-seagate drivers/ide/pci/siimage.c
--- linux-2.6.0-test11/drivers/ide/pci/siimage.c~ide-siimage-seagate 2003-11-30 15:38:48.512585200 +0100
+++ linux-2.6.0-test11-root/drivers/ide/pci/siimage.c 2003-11-30 15:38:48.516584592 +0100
@@ -1047,6 +1047,27 @@ static void __init init_mmio_iops_siimag
hwif->mmio = 2;
}

+static int is_dev_seagate_sata(ide_drive_t *drive)
+{
+ const char *s = &drive->id->model[0];
+ unsigned len;
+
+ if (!drive->present)
+ return 0;
+
+ len = strnlen(s, sizeof(drive->id->model));
+
+ if ((len > 4) && (!memcmp(s, "ST", 2))) {
+ if ((!memcmp(s + len - 2, "AS", 2)) ||
+ (!memcmp(s + len - 3, "ASL", 3))) {
+ printk(KERN_INFO "%s: applying pessimistic Seagate "
+ "errata fix\n", drive->name);
+ return 1;
+ }
+ }
+ return 0;
+}
+
/**
* init_iops_siimage - set up iops
* @hwif: interface to set up
@@ -1068,7 +1089,7 @@ static void __init init_iops_siimage (id
hwif->hwif_data = 0;

hwif->rqsize = 128;
- if (is_sata(hwif))
+ if (is_sata(hwif) && is_dev_seagate_sata(&hwif->drives[0]))
hwif->rqsize = 15;

if (pci_get_drvdata(dev) == NULL)

_
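
In other words (the Seagate model string below is just an example, not
one from this thread):

/* is_dev_seagate_sata() returns 1 only when the drive's IDENTIFY model
 * string starts with "ST" and ends in "AS" or "ASL" (e.g. a model such
 * as "ST380013AS" would match, while the SP1614N above would not), so
 * only such drives get the pessimistic hwif->rqsize = 15 cap and
 * everything else keeps the default of 128. */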

2003-11-30 15:52:04

by Prakash K. Cheemplavam

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Bartlomiej Zolnierkiewicz wrote:
> Okay, stop bashing IDE driver... three mails is enough...
>
> Apply this patch and you should get similar performance from IDE driver.
> You are probably seeing big improvements with libata driver because you are
> using Samsung and IBM/Hitachi drives only, for Seagate it probably sucks just
> like IDE driver...
>
> IDE driver limits requests to 15kB for all SATA drives...
> libata driver limits requests to 15kB only for Seagate SATA drives...

If you had read my message closely, you would have understood that
setting the request size higher *didn't* help, i.e.

echo "max_kb_per_request:128" > /proc/ide/hde/settings

made *no* difference, so I won't even try that patch. As far as I have
understood, this is exactly the thing you changed in the patch. If I am
mistaken, then I take it back.

Prakash

2003-11-30 16:25:37

by Jens Axboe

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, Nov 30 2003, Bartlomiej Zolnierkiewicz wrote:
>
> I read it _very_ closely, here is your original mail with subject
> "Re: 2.6.0-test9 /-mm3 SATA siimage - bad disk performance":
>
> On Saturday 15 of November 2003 10:11, Prakash K. Cheemplavam wrote:
> > Marcus Hartig wrote:
> > > Hello all,
> > >
> > > with the Fedora 1 kernel 2.4.22-1.2115.nptl I get with hdparm -t
> > > (Timing buffered disk reads) 34 MB/sec. Its very slow for this drive.
> > >
> > > With 2.6.0-test9 and -mm3 I get around "62 MB in 3.05 = 20,31". Wow"
> > > Back to ~1998?
> >
> > I have a similar problem: With 2.4.22-ac3 I had 37mb/sec with my Samsung
> > HD and 49MB/sec with IBM/Hitachi, now with 2.6 (all I tried, including
> ^^^^^^^^^^^^
> > test9-mm2) I had only 20mb/sec for Samsung and about 39mb/sec for the
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > IBM. Motherboard is Abit NF7-S Rev2.0, as well, so same situation with
> ^^^^
> > the siimage 1.06 driver. I wanted to run some dd tests as well, but it
> > is a real performance hit. Playing with readahead or other hdparm
> > options didn't help either.
> >
> > Prakash
>
> In 2.6.x there is no max_kb_per_request setting in /proc/ide/hdx/settings.
> Therefore
> echo "max_kb_per_request:128" > /proc/ide/hde/settings
> does not work.
>
> Hmm. actually I was under influence that we have generic ioctls in 2.6.x,
> but I can find only BLKSECTGET, BLKSECTSET was somehow lost. Jens?

Probably because it's very dangerous to expose, echo something too big
and watch your data disappear.

--
Jens Axboe

Subject: Re: Silicon Image 3112A SATA trouble


I read it _very_ closely, here is your original mail with subject
"Re: 2.6.0-test9 /-mm3 SATA siimage - bad disk performance":

On Saturday 15 of November 2003 10:11, Prakash K. Cheemplavam wrote:
> Marcus Hartig wrote:
> > Hello all,
> >
> > with the Fedora 1 kernel 2.4.22-1.2115.nptl I get with hdparm -t
> > (Timing buffered disk reads) 34 MB/sec. Its very slow for this drive.
> >
> > With 2.6.0-test9 and -mm3 I get around "62 MB in 3.05 = 20,31". Wow"
> > Back to ~1998?
>
> I have a similar problem: With 2.4.22-ac3 I had 37mb/sec with my Samsung
> HD and 49MB/sec with IBM/Hitachi, now with 2.6 (all I tried, including
^^^^^^^^^^^^
> test9-mm2) I had only 20mb/sec for Samsung and about 39mb/sec for the
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> IBM. Motherboard is Abit NF7-S Rev2.0, as well, so same situation with
^^^^
> the siimage 1.06 driver. I wanted to run some dd tests as well, but it
> is a real performance hit. Playing with readahead or other hdparm
> options didn't help either.
>
> Prakash

In 2.6.x there is no max_kb_per_request setting in /proc/ide/hdx/settings.
Therefore
echo "max_kb_per_request:128" > /proc/ide/hde/settings
does not work.

Hmm, actually I was under the impression that we have generic ioctls in 2.6.x,
but I can find only BLKSECTGET; BLKSECTSET was somehow lost. Jens?
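
For what it's worth, the getter that did survive can be exercised from
userspace roughly like this (a minimal sketch; the device node is just
an example and error reporting is skipped):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>		/* BLKSECTGET */

int main(void)
{
	unsigned short max_sectors = 0;
	int fd = open("/dev/hde", O_RDONLY | O_NONBLOCK);

	if (fd < 0)
		return 1;

	/* BLKSECTGET reports the block layer's per-request sector limit */
	if (ioctl(fd, BLKSECTGET, &max_sectors) == 0)
		printf("max sectors per request: %u (%u KiB)\n",
		       max_sectors, max_sectors / 2);

	close(fd);
	return 0;
}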

Prakash, please try patch and maybe you will have 2 working drivers now :-).

--bart

On Sunday 30 of November 2003 16:52, Prakash K. Cheemplavam wrote:
> Bartlomiej Zolnierkiewicz wrote:
> > Okay, stop bashing IDE driver... three mails is enough...
> >
> > Apply this patch and you should get similar performance from IDE driver.
> > You are probably seeing big improvements with libata driver because you are
> > using Samsung and IBM/Hitachi drives only, for Seagate it probably sucks just
> > like IDE driver...
> >
> > IDE driver limits requests to 15kB for all SATA drives...
> > libata driver limits requests to 15kB only for Seagate SATA drives...
>
> If you read my message closely then you should have understand that
> setting the request highr *didn't* help, ie
>
> echo "max_kb_per_request:128" > /proc/ide/hde/settings
>
> made *no* difference, so I won't even try that patch. As far I have
> understood this is exactly the thing you changed in the patch. If I am
> mistaken, then I take it back.
>
> Prakash

2003-11-30 16:28:16

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Bartlomiej Zolnierkiewicz wrote:
> Apply this patch and you should get similar performance from IDE driver.
> You are probably seeing big improvements with libata driver because you are
> using Samsung and IBM/Hitachi drives only, for Seagate it probably sucks just
> like IDE driver...

Looks good to me.


> IDE driver limits requests to 15kB for all SATA drives...
> libata driver limits requests to 15kB only for Seagate SATA drives...
>
> Both drivers still need proper fix for Seagate drives...

Yep. Do you have the Maxtor fix, as well? It's in libata's SII driver,
though it should be noted that the Maxtor errata only occurs for
PATA<->SATA bridges, and not for real Maxtor SATA drives.

Jeff



Subject: Re: Silicon Image 3112A SATA trouble


Yes, siimage.c contains the Maxtor fix as well; there is even a comment from
Alan about Marvell PATA<->SATA bridges...

--bart

On Sunday 30 of November 2003 17:27, Jeff Garzik wrote:
> Bartlomiej Zolnierkiewicz wrote:
> > Apply this patch and you should get similar performance from IDE driver.
> > You are probably seeing big improvements with libata driver because you
> > are using Samsung and IBM/Hitachi drives only, for Seagate it probably
> > sucks just like IDE driver...
>
> Looks good to me.
>
> > IDE driver limits requests to 15kB for all SATA drives...
> > libata driver limits requests to 15kB only for Seagate SATA drives...
> >
> > Both drivers still need proper fix for Seagate drives...
>
> Yep. Do you have the Maxtor fix, as well? It's in libata's SII driver,
> though it should be noted that the Maxtor errata only occurs for
> PATA<->SATA bridges, and not for real Maxtor SATA drives.
>
> Jeff

2003-11-30 16:46:05

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Mark Hahn wrote:
>>>So folks, try libata, as well.
>>
>>Thanks :)
>
>
> what do you think the chances are of libata becoming the primary ata
> interface for 2.4 and 2.6? there have always been major changes even
> to stable releases in the past, at least when the change seems to be a
> big improvement.

"primary ata interface" is a bit tough to define. Serial ATA will
become _the_ ATA interface on motherboards of the future. From a
software perspective, it really only matters what hardware driver you
load...


> incidentally, can you give me any clues to description/discussion you
> might have engaged in about libata? I saw your prog-ref pdf, but it
> doesn't really describe the motivation, issues of going scsi, etc.
> (I looked at lkml and google, but couldn't filter well enough...)

Mostly just design in my head, plus a bit of discussion at the Kernel
Summit earlier this year.


> feel free to reply to lkml. libata design/status/future is clearly of
> general interest...

I'm putting together a "Serial ATA status report", to be posted to lkml
and linux-scsi, which should hopefully cover all that. Your email kicks
me into action again, for that report, for which I should thank you :)

Jeff



2003-11-30 16:42:13

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Jens Axboe wrote:
> On Sun, Nov 30 2003, Bartlomiej Zolnierkiewicz wrote:
>>Hmm. actually I was under influence that we have generic ioctls in 2.6.x,
>>but I can find only BLKSECTGET, BLKSECTSET was somehow lost. Jens?
>
>
> Probably because it's very dangerous to expose, echo something too big
> and watch your data disappear.


IMO, agreed.

Max KB per request really should be set by the driver, as it's a
hardware-specific thing that (as we see :)) is often errata-dependent.

Tangent: My non-pessimistic fix will involve submitting a single sector
DMA r/w taskfile manually, then proceeding with the remaining sectors in
another r/w taskfile. This doubles the interrupts on the affected
chipset/drive combos, but still allows large requests. I'm not terribly
fond of partial completions, as I feel they add complexity, particularly
so in my case: I can simply use the same error paths for both the
single-sector taskfile and the "everything else" taskfile, regardless of
which taskfile throws the error.

(thinking out loud) Though it is best for simplicity, I am curious whether
a succession of "tiny/huge" transaction pairs is efficient. I am hoping
that the drive's cache, coupled with the fact that each pair of
taskfiles is sequentially contiguous, will not hurt speed too much over
a non-errata configuration...

Jeff



Subject: Re: Silicon Image 3112A SATA trouble

hello:

I have a Seagate Barracuda IV (80 GB) connected to parallel ATA on an
nforce2 motherboard.

If any of you want me to test any patch to fix the "Seagate issue",
please count on me. I have a SATA SiI3112 and a parallel-to-serial
converter. If I'm of any help to you, drop me an email.

By the way, I'm only getting 32 MB/s (hdparm -tT /dev/hda) on my
current parallel ATA setup. Is this enough for an ATA-100 device?

Thanks a lot.

LuisMi García
Spain

Subject: Re: Silicon Image 3112A SATA trouble

On Sunday 30 of November 2003 17:51, Jens Axboe wrote:
> On Sun, Nov 30 2003, Jeff Garzik wrote:
> > Jens Axboe wrote:
> > >On Sun, Nov 30 2003, Bartlomiej Zolnierkiewicz wrote:
> > >>Hmm. actually I was under influence that we have generic ioctls in
> > >> 2.6.x, but I can find only BLKSECTGET, BLKSECTSET was somehow lost.
> > >> Jens?
> > >
> > >Probably because it's very dangerous to expose, echo something too big
> > >and watch your data disappear.
> >
> > IMO, agreed.
> >
> > Max KB per request really should be set by the driver, as it's a
> > hardware-specific thing that (as we see :)) is often errata-dependent.

Yep.

> Yes, it would be better to have a per-drive (or hwif) extra limiting
> factor if it is needed. For this case it really isn't, so probably not
> the best idea :)
>
> > Tangent: My non-pessimistic fix will involve submitting a single sector
> > DMA r/w taskfile manually, then proceeding with the remaining sectors in
> > another r/w taskfile. This doubles the interrupts on the affected
> > chipset/drive combos, but still allows large requests. I'm not terribly
>
> Or split the request 50/50.

We can't - hardware will lock up.

--bart

2003-11-30 16:51:59

by Jens Axboe

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, Nov 30 2003, Jeff Garzik wrote:
> Jens Axboe wrote:
> >On Sun, Nov 30 2003, Bartlomiej Zolnierkiewicz wrote:
> >>Hmm. actually I was under influence that we have generic ioctls in 2.6.x,
> >>but I can find only BLKSECTGET, BLKSECTSET was somehow lost. Jens?
> >
> >
> >Probably because it's very dangerous to expose, echo something too big
> >and watch your data disappear.
>
>
> IMO, agreed.
>
> Max KB per request really should be set by the driver, as it's a
> hardware-specific thing that (as we see :)) is often errata-dependent.

Yes, it would be better to have a per-drive (or hwif) extra limiting
factor if it is needed. For this case it really isn't, so probably not
the best idea :)

> Tangent: My non-pessimistic fix will involve submitting a single sector
> DMA r/w taskfile manually, then proceeding with the remaining sectors in
> another r/w taskfile. This doubles the interrupts on the affected
> chipset/drive combos, but still allows large requests. I'm not terribly

Or split the request 50/50.

> fond of partial completions, as I feel they add complexity, particularly
> so in my case: I can simply use the same error paths for both the
> single-sector taskfile and the "everything else" taskfile, regardless of
> which taskfile throws the error.

It's just a question of maintaining the proper request state so you
know how much and what part of a request is pending. Requests have been
handled this way ever since clustered requests; that is why
current_nr_sectors differs from nr_sectors. And with the hard_* duplicates,
it's pretty easy to extend this a bit. I don't see this as something
complex, and if the alternative you are suggesting (your implementation
idea is not clear to me...) is to fork another request, then I think
partial completion is a lot better.

Say you receive a request that violates the magic sector count rule. You
decide to do the first half of the request, and setup your taskfile for
that. You can diminish nr_sectors appropriately, or you can keep this
sector count in the associated taskfile - whatever you prefer. The
end_io path that covers both "normal" and partial IO is basically:

if (!end_that_request_first(rq, 1, sectors))
        rq is done
else
        rq state is now correctly the 2nd half

In the not-done case, you simply fall out of your isr as you would a
complete request, and let your request_fn just start it again. You don't
even know this request has already been processed.

Depending on whether you remove the request from the queue or not, you
just push the request to the top of the request queue so you are certain
that you start this one next.

So there's really nothing special about partial completions, rather full
completions are a one-shot partial completion :)
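
In 2.6-era driver code that pattern comes out to roughly the following
(an illustrative sketch only, not code from siimage.c or libata; the
function name and the way sectors_done is obtained are made up, and the
queue lock is assumed to be held as in the usual IDE end-request path):

#include <linux/blkdev.h>

static void example_dma_done(struct request *rq, unsigned int sectors_done)
{
	/* complete only what the hardware actually transferred */
	if (!end_that_request_first(rq, 1 /* uptodate */, sectors_done)) {
		/* nothing left: retire the request */
		blkdev_dequeue_request(rq);
		end_that_request_last(rq);
		return;
	}
	/* rq now describes the untransferred tail; fall back out of the
	   ISR as for a complete request and let the request_fn issue
	   the remainder as if it were brand new */
}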

> (thinking out loud) Though best for simplicity, I am curious if a
> succession of "tiny/huge" transaction pairs are efficient? I am hoping
> that the drive's cache, coupled with the fact that each pair of
> taskfiles is sequentially contiguous, will not hurt speed too much over
> a non-errata configuration...

My gut would say rather two 64kb than a 124 and 4kb. But you should do
the numbers, of course :). I'd be surprised if the former wouldn't be
more efficient.

--
Jens Axboe

2003-11-30 17:06:25

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Bartlomiej Zolnierkiewicz wrote:
> On Sunday 30 of November 2003 17:51, Jens Axboe wrote:
>>>Tangent: My non-pessimistic fix will involve submitting a single sector
>>>DMA r/w taskfile manually, then proceeding with the remaining sectors in
>>>another r/w taskfile. This doubles the interrupts on the affected
>>>chipset/drive combos, but still allows large requests. I'm not terribly
>>
>>Or split the request 50/50.
>
>
> We can't - hardware will lock up.

Well, the constraint we must satisfy is

sector_count % 15 != 1

(i.e. "== 1" causes the lockup)

Beyond that, any request ratio should be ok AFAIK...

Jeff



2003-11-30 17:08:20

by Jens Axboe

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, Nov 30 2003, Bartlomiej Zolnierkiewicz wrote:
> > Yes, it would be better to have a per-drive (or hwif) extra limiting
> > factor if it is needed. For this case it really isn't, so probably not
> > the best idea :)
> >
> > > Tangent: My non-pessimistic fix will involve submitting a single sector
> > > DMA r/w taskfile manually, then proceeding with the remaining sectors in
> > > another r/w taskfile. This doubles the interrupts on the affected
> > > chipset/drive combos, but still allows large requests. I'm not terribly
> >
> > Or split the request 50/50.
>
> We can't - hardware will lock up.

I know the problem. Then don't take "split 50/50" literally; my point was to
split it closer to 50/50 than 1 sector + the rest.

--
Jens Axboe

2003-11-30 17:15:13

by Jens Axboe

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, Nov 30 2003, Bartlomiej Zolnierkiewicz wrote:
>
> On Sunday 30 of November 2003 18:08, Jens Axboe wrote:
> > On Sun, Nov 30 2003, Bartlomiej Zolnierkiewicz wrote:
> > > > Yes, it would be better to have a per-drive (or hwif) extra limiting
> > > > factor if it is needed. For this case it really isn't, so probably not
> > > > the best idea :)
> > > >
> > > > > Tangent: My non-pessimistic fix will involve submitting a single
> > > > > sector DMA r/w taskfile manually, then proceeding with the remaining
> > > > > sectors in another r/w taskfile. This doubles the interrupts on the
> > > > > affected chipset/drive combos, but still allows large requests. I'm
> > > > > not terribly
> > > >
> > > > Or split the request 50/50.
> > >
> > > We can't - hardware will lock up.
> >
> > I know the problem. Then don't split 50/50 to the word, my point was to
> > split it closer to 50/50 than 1 sector + the rest.
>
> Oh, I understand now and agree.

Cool. BTW to make myself 100% clear, I don't mean "split" as in split
the request, merely the amount issued to the hardware. Request splitting
has such an ugly ring to it :)

--
Jens Axboe

2003-11-30 17:12:02

by Jens Axboe

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, Nov 30 2003, Jeff Garzik wrote:
> Bartlomiej Zolnierkiewicz wrote:
> >On Sunday 30 of November 2003 17:51, Jens Axboe wrote:
> >>>Tangent: My non-pessimistic fix will involve submitting a single sector
> >>>DMA r/w taskfile manually, then proceeding with the remaining sectors in
> >>>another r/w taskfile. This doubles the interrupts on the affected
> >>>chipset/drive combos, but still allows large requests. I'm not terribly
> >>
> >>Or split the request 50/50.
> >
> >
> >We can't - hardware will lock up.
>
> Well, the constraint we must satisfy is
>
> sector_count % 15 != 1

(sector_count % 15 != 1) && (sector_count != 1)

to be more precise :)

--
Jens Axboe

2003-11-30 17:19:37

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Jens Axboe wrote:
> On Sun, Nov 30 2003, Jeff Garzik wrote:
>>fond of partial completions, as I feel they add complexity, particularly
>>so in my case: I can simply use the same error paths for both the
>>single-sector taskfile and the "everything else" taskfile, regardless of
>>which taskfile throws the error.
>
>
> It's just a questions of maintaining the proper request state so you
> know how much and what part of a request is pending. Requests have been
> handled this way ever since clustered requests, that is why
> current_nr_sectors differs from nr_sectors. And with hard_* duplicates,
> it's pretty easy to extend this a bit. I don't see this as something
> complex, and if the alternative you are suggesting (your implementation
> idea is not clear to me...) is to fork another request then I think it's
> a lot better.
[snip howto]

Yeah, I know how to do partial completions. The increased complexity
arises in my driver. It's simply less code in my driver to treat each
transaction as an "all or none" affair.

For the vastly common case, it's less i-cache and fewer interrupts to do
all-or-none. In the future I'll probably want to put partial
completions in the error path...


>>(thinking out loud) Though best for simplicity, I am curious if a
>>succession of "tiny/huge" transaction pairs are efficient? I am hoping
>>that the drive's cache, coupled with the fact that each pair of
>>taskfiles is sequentially contiguous, will not hurt speed too much over
>>a non-errata configuration...
>
>
> My gut would say rather two 64kb than a 124 and 4kb. But you should do
> the numbers, of course :). I'd be surprised if the former wouldn't be
> more efficient.

That's why I was thinking out loud, and also why I CC'd Eric :) We'll
see. I'll implement whichever is easier first, which will certainly be
better than the current sledgehammer limit. Any improvement over the
current code will provide dramatic performance increases, and we can
tune after that...

Jeff



Subject: Re: Silicon Image 3112A SATA trouble


On Sunday 30 of November 2003 18:08, Jens Axboe wrote:
> On Sun, Nov 30 2003, Bartlomiej Zolnierkiewicz wrote:
> > > Yes, it would be better to have a per-drive (or hwif) extra limiting
> > > factor if it is needed. For this case it really isn't, so probably not
> > > the best idea :)
> > >
> > > > Tangent: My non-pessimistic fix will involve submitting a single
> > > > sector DMA r/w taskfile manually, then proceeding with the remaining
> > > > sectors in another r/w taskfile. This doubles the interrupts on the
> > > > affected chipset/drive combos, but still allows large requests. I'm
> > > > not terribly
> > >
> > > Or split the request 50/50.
> >
> > We can't - hardware will lock up.
>
> I know the problem. Then don't split 50/50 to the word, my point was to
> split it closer to 50/50 than 1 sector + the rest.

Oh, I understand now and agree.

--bart

2003-11-30 17:14:02

by Craig Bradney

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On the topic of speeds... hdparm -t gives me 56MB/s on my Maxtor 80GB 8MB
cache PATA drive. I got that with 2.4.23-pre8, which was ATA100, and get
just a little more on ATA133 with 2.6. Not sure what people are
expecting of SATA.

Craig

On Sun, 2003-11-30 at 18:52, Luis Miguel García wrote:
> hello:
>
> I have a Seagate Barracuda IV (80 Gb) connected to parallel ata on a
> nforce-2 motherboard.
>
> If any of you want for me to test any patch to fix the "seagate issue",
> please, count on me. I have a SATA sis3112 and a parallel-to-serial
> converter. If I'm of any help to you, drop me an email.
>
> By the way, I'm only getting 32 MB/s (hdparm -tT /dev/hda) on my
> actual parallel ata. Is this enough for an ATA-100 device?
>
> Thanks a lot.
>
> LuisMi García
> Spain
>

2003-11-30 17:19:48

by Prakash K. Cheemplavam

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Bartlomiej Zolnierkiewicz wrote:

> In 2.6.x there is no max_kb_per_request setting in /proc/ide/hdx/settings.
> Therefore
> echo "max_kb_per_request:128" > /proc/ide/hde/settings
> does not work.
>
> Hmm. actually I was under influence that we have generic ioctls in 2.6.x,
> but I can find only BLKSECTGET, BLKSECTSET was somehow lost. Jens?
>
> Prakash, please try patch and maybe you will have 2 working drivers now :-).


OK, this patch fixes the transfer rate problem. Nice; so I had wanted to do
the right thing, but it didn't work, as you explained... Thanks.

Nevertheless there is still the issue left:

hdparm -d1 /dev/hde still wreaks major havoc on the drive (something like:
ide: dma_intr: status=0x58 { DriveReady SeekComplete DataRequest }
ide status timeout=0xd8 { Busy }; messages taken from swsusp's kernel
panic). I have to do a hard reset. I guess it is the same reason why swsusp
gets a kernel panic when it sends PM commands to siimage.c. (Maybe the
same error is in libata, causing the same kernel panic on swsusp.)

Any clues?

Nice that at least the siimage driver has got some improvement after me
getting on your nerves. ;-)

Prakash

2003-11-30 17:31:42

by Jens Axboe

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, Nov 30 2003, Jeff Garzik wrote:
> Jens Axboe wrote:
> >On Sun, Nov 30 2003, Jeff Garzik wrote:
> >>fond of partial completions, as I feel they add complexity, particularly
> >>so in my case: I can simply use the same error paths for both the
> >>single-sector taskfile and the "everything else" taskfile, regardless of
> >>which taskfile throws the error.
> >
> >
> >It's just a questions of maintaining the proper request state so you
> >know how much and what part of a request is pending. Requests have been
> >handled this way ever since clustered requests, that is why
> >current_nr_sectors differs from nr_sectors. And with hard_* duplicates,
> >it's pretty easy to extend this a bit. I don't see this as something
> >complex, and if the alternative you are suggesting (your implementation
> >idea is not clear to me...) is to fork another request then I think it's
> >a lot better.
> [snip howto]
>
> Yeah, I know how to do partial completions. The increased complexity
> arises in my driver. It's simply less code in my driver to treat each
> transaction as an "all or none" affair.
>
> For the vastly common case, it's less i-cache and less interrupts to do
> all-or-none. In the future I'll probably want to put partial
> completions in the error path...

Oh come on, i-cache? We're doing IO here; a cache line more or less in
request handling is completely in the noise.

What are the "increased complexity" involved with doing partial
completions? You don't even have to know it's a partial request in the
error handling, it's "just the request" state. Honestly, I don't see a
problem there. You'll have to expand on what exactly you see as added
complexity. To me it still seems like the fastest and most elegant way
to handle it. It requires no special attention on request buildup, it
requires no extra request and ugly split-code in the request handling.
And the partial-completions come for free with the block layer code.

> >>(thinking out loud) Though best for simplicity, I am curious if a
> >>succession of "tiny/huge" transaction pairs are efficient? I am hoping
> >>that the drive's cache, coupled with the fact that each pair of
> >>taskfiles is sequentially contiguous, will not hurt speed too much over
> >>a non-errata configuration...
> >
> >
> >My gut would say rather two 64kb than a 124 and 4kb. But you should do
> >the numbers, of course :). I'd be surprised if the former wouldn't be
> >more efficient.
>
> That's why I was thinking out loud, and also why I CC'd Eric :) We'll

Numbers are better than Eric :)

> see. I'll implement whichever is easier first, which will certainly be
> better than the current sledgehammer limit. Any improvement over the

Definitely, the current static limit completely sucks...

> current code will provide dramatic performance increases, and we can
> tune after that...

A path needs to be chosen first, though.

--
Jens Axboe

2003-11-30 17:31:47

by Jens Axboe

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, Nov 30 2003, Jeff Garzik wrote:
> Jens Axboe wrote:
> >>Well, the constraint we must satisfy is
> >>
> >> sector_count % 15 != 1
> >
> >
> > (sector_count % 15 != 1) && (sector_count != 1)
> >
> >to be more precise :)
>
>
> Thanks for the clarification, I did not know that.
>
> Avoiding sector_count==1 requires additional code :( Valid requests
> might be a single sector. With page-based blkdevs requests smaller than
> a page would certainly be infrequent, but are still possible, with bsg
> for example...

You misread it... sector_count == 1 is fine, sector_count % 15 == 1 is
ok when sector_count is 1 (it would have to be, or sector_count == 1
would not be ok :)

--
Jens Axboe

2003-11-30 17:24:58

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Jens Axboe wrote:
>>Well, the constraint we must satisfy is
>>
>> sector_count % 15 != 1
>
>
> (sector_count % 15 != 1) && (sector_count != 1)
>
> to be more precise :)


Thanks for the clarification, I did not know that.

Avoiding sector_count==1 requires additional code :( Valid requests
might be a single sector. With page-based blkdevs requests smaller than
a page would certainly be infrequent, but are still possible, with bsg
for example...

Jeff



2003-11-30 17:49:09

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Jens Axboe wrote:
> On Sun, Nov 30 2003, Jeff Garzik wrote:
>
>>Jens Axboe wrote:
>>
>>>>Well, the constraint we must satisfy is
>>>>
>>>> sector_count % 15 != 1
>>>
>>>
>>> (sector_count % 15 != 1) && (sector_count != 1)
>>>
>>>to be more precise :)
>>
>>
>>Thanks for the clarification, I did not know that.
>>
>>Avoiding sector_count==1 requires additional code :( Valid requests
>>might be a single sector. With page-based blkdevs requests smaller than
>>a page would certainly be infrequent, but are still possible, with bsg
>>for example...
>
>
> You misread it... sector_count == 1 is fine, sector_count % 15 == 1 is
> ok when sector_count is 1 (it would have to be, or sector_count == 1
> would not be ok :)


Ahh, duh. Thanks again. Yeah, it makes sense since the bug arises from
too-large Serial ATA data transactions on the SATA bus...

Jeff
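
With the constraint settled (a transfer is unsafe only when its sector
count is greater than one and leaves a remainder of one when divided by
15), a driver-side clamp could look roughly like this; an illustrative
sketch only, not code from siimage.c or sata_sil, and the helper name
is made up:

static unsigned int sil_safe_nsectors(unsigned int want)
{
	/* a single sector is always legal; otherwise avoid a count of
	   the form 15*k + 1 by holding one sector back for the next
	   transaction */
	if (want > 1 && (want % 15) == 1)
		want--;
	return want;
}

For example, sil_safe_nsectors(16) returns 15, while counts of 1 or 30
pass through unchanged.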



Subject: Re: Silicon Image 3112A SATA trouble

So definitely, 32 MB/s is almost half the speed that you get. I'm on
2.6-test11. I don't know of more options to try. The next will be booting
with "noapic nolapic". Some people reported better results with this.

By the way, I have booted with "doataraid noraid" (no drives connected,
only SATA support enabled in the BIOS), and nothing is shown in the boot
messages (or dmesg) about libata being loaded. I don't know whether I
must connect a hard drive before the driver shows up, but I don't think so.


Thanks!

LuisMi Garcia


Craig Bradney wrote:

> On the topic of speeds.. hdparm -t gives me 56Mb/s on my Maxtor 80Mb 8mb
> cache PATA drive. I got that with 2.4.23 pre 8 which was ATA100 and get
> just a little more on ATA133 with 2.6. Not sure what people are
> expecting on SATA.
>
> Craig
>
> On Sun, 2003-11-30 at 18:52, Luis Miguel García wrote:
>
>
>> hello:
>>
>> I have a Seagate Barracuda IV (80 Gb) connected to parallel ata on a
>> nforce-2 motherboard.
>>
>> If any of you want for me to test any patch to fix the "seagate
>> issue", please, count on me. I have a SATA sis3112 and a
>> parallel-to-serial converter. If I'm of any help to you, drop me an
>> email.
>>
>> By the way, I'm only getting 32 MB/s (hdparm -tT /dev/hda) on my
>> actual parallel ata. Is this enough for an ATA-100 device?
>>
>> Thanks a lot.
>>
>> LuisMi García
>> Spain
>>


2003-11-30 17:46:10

by Jens Axboe

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, Nov 30 2003, Jeff Garzik wrote:
> Jens Axboe wrote:
> >On Sun, Nov 30 2003, Jeff Garzik wrote:
> >
> >>Jens Axboe wrote:
> >>
> >>>On Sun, Nov 30 2003, Jeff Garzik wrote:
> >>>
> >>>>fond of partial completions, as I feel they add complexity,
> >>>>particularly so in my case: I can simply use the same error paths for
> >>>>both the single-sector taskfile and the "everything else" taskfile,
> >>>>regardless of which taskfile throws the error.
> >>>
> >>>
> >>>It's just a question of maintaining the proper request state so you
> >>>know how much and what part of a request is pending. Requests have been
> >>>handled this way ever since clustered requests, that is why
> >>>current_nr_sectors differs from nr_sectors. And with hard_* duplicates,
> >>>it's pretty easy to extend this a bit. I don't see this as something
> >>>complex, and if the alternative you are suggesting (your implementation
> >>>idea is not clear to me...) is to fork another request then I think it's
> >>>a lot better.
> >>
> >>[snip howto]
> >>
> >>Yeah, I know how to do partial completions. The increased complexity
> >>arises in my driver. It's simply less code in my driver to treat each
> >>transaction as an "all or none" affair.
> >>
> >>For the vastly common case, it's less i-cache and less interrupts to do
> >>all-or-none. In the future I'll probably want to put partial
> >>completions in the error path...
> >
> >
> >Oh come on, i-cache? We're doing IO here, a cache line more or less in
> >request handling is absolutely so much in the noise.
> >
> >What are the "increased complexity" involved with doing partial
> >completions? You don't even have to know it's a partial request in the
> >error handling, it's "just the request" state. Honestly, I don't see a
> >problem there. You'll have to expand on what exactly you see as added
> >complexity. To me it still seems like the fastest and most elegant way
> >to handle it. It requires no special attention on request buildup, it
> >requires no extra request and ugly split-code in the request handling.
> >And the partial-completions come for free with the block layer code.
>
> libata, drivers/ide, and SCSI all must provide internal "submit this
> taskfile/cdb" API that is decoupled from struct request. Therefore,

Yes

> submitting a transaction pair, or for ATAPI submitting the internal
> REQUEST SENSE, is quite simple and only a few lines of code.

SCSI already does these partial completions...

> Any extra diddling of the hardware, and struct request, to provide
> partial completions is extra code. The hardware is currently set up to
> provide only "it's done" or "it failed" information. Logically, then,
> partial completions must be more code than the current <none> ;-)

That's not a valid argument. Whatever you do, you have to add some lines
of code.

> WRT error handling, according to ATA specs I can look at the error
> information to determine how much of the request, if any, completed
> successfully. (dunno if this is also doable on ATAPI) That's why
> partial completions in the error path make sense to me.

... so if you do partial completions in the normal paths (or rather
allow them), error handling will be simpler. And we all know where the
hard and stupid bugs are - the basically never tested error handling.

> >>see. I'll implement whichever is easier first, which will certainly
> >>be better than the current sledgehammer limit. Any improvement over
> >>the
> >
> >
> >Definitely, the current static limit completely sucks...
> >
> >
> >>current code will provide dramatic performance increases, and we can
> >>tune after that...
> >
> >
> >A path needs to be chosen first, though.
>
> The path has been chosen: the "it works" solution first, then tune.
> :)

Since one path excludes the other, you must choose a path first. Tuning
is honing a path, not rewriting that code.

--
Jens Axboe

2003-11-30 17:42:20

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Jens Axboe wrote:
> On Sun, Nov 30 2003, Jeff Garzik wrote:
>
>>Jens Axboe wrote:
>>
>>>On Sun, Nov 30 2003, Jeff Garzik wrote:
>>>
>>>>fond of partial completions, as I feel they add complexity, particularly
>>>>so in my case: I can simply use the same error paths for both the
>>>>single-sector taskfile and the "everything else" taskfile, regardless of
>>>>which taskfile throws the error.
>>>
>>>
>>>It's just a question of maintaining the proper request state so you
>>>know how much and what part of a request is pending. Requests have been
>>>handled this way ever since clustered requests, that is why
>>>current_nr_sectors differs from nr_sectors. And with hard_* duplicates,
>>>it's pretty easy to extend this a bit. I don't see this as something
>>>complex, and if the alternative you are suggesting (your implementation
>>>idea is not clear to me...) is to fork another request then I think it's
>>>a lot better.
>>
>>[snip howto]
>>
>>Yeah, I know how to do partial completions. The increased complexity
>>arises in my driver. It's simply less code in my driver to treat each
>>transaction as an "all or none" affair.
>>
>>For the vastly common case, it's less i-cache and less interrupts to do
>>all-or-none. In the future I'll probably want to put partial
>>completions in the error path...
>
>
> Oh come on, i-cache? We're doing IO here, a cache line more or less in
> request handling is absolutely so much in the noise.
>
> What are the "increased complexity" involved with doing partial
> completions? You don't even have to know it's a partial request in the
> error handling, it's "just the request" state. Honestly, I don't see a
> problem there. You'll have to expand on what exactly you see as added
> complexity. To me it still seems like the fastest and most elegant way
> to handle it. It requires no special attention on request buildup, it
> requires no extra request and ugly split-code in the request handling.
> And the partial-completions come for free with the block layer code.

libata, drivers/ide, and SCSI all must provide an internal "submit this
taskfile/cdb" API that is decoupled from struct request. Therefore,
submitting a transaction pair, or for ATAPI submitting the internal
REQUEST SENSE, is quite simple and only a few lines of code.

Any extra diddling of the hardware, and struct request, to provide
partial completions is extra code. The hardware is currently set up to
provide only "it's done" or "it failed" information. Logically, then,
partial completions must be more code than the current <none> ;-)


WRT error handling, according to ATA specs I can look at the error
information to determine how much of the request, if any, completed
successfully. (dunno if this is also doable on ATAPI) That's why
partial completions in the error path make sense to me.


>>>>(thinking out loud) Though best for simplicity, I am curious if a
>>>>succession of "tiny/huge" transaction pairs are efficient? I am hoping
>>>>that the drive's cache, coupled with the fact that each pair of
>>>>taskfiles is sequentially contiguous, will not hurt speed too much over
>>>>a non-errata configuration...
>>>
>>>
>>>My gut would say rather two 64kb than a 124 and 4kb. But you should do
>>>the numbers, of course :). I'd be surprised if the former wouldn't be
>>>more efficient.
>>
>>That's why I was thinking out loud, and also why I CC'd Eric :) We'll
>
>
> Numbers are better than Eric :)

Agreed.


>>see. I'll implement whichever is easier first, which will certainly be
>>better than the current sledgehammer limit. Any improvement over the
>
>
> Definitely, the current static limit completely sucks...
>
>
>>current code will provide dramatic performance increases, and we can
>>tune after that...
>
>
> A path needs to be chosen first, though.

The path has been chosen: the "it works" solution first, then tune. :)

Jeff



2003-11-30 17:57:45

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Jens Axboe wrote:
> On Sun, Nov 30 2003, Jeff Garzik wrote:
>
>>Jens Axboe wrote:
>>
>>>On Sun, Nov 30 2003, Jeff Garzik wrote:
>>>
>>>
>>>>Jens Axboe wrote:
>>>>
>>>>
>>>>>On Sun, Nov 30 2003, Jeff Garzik wrote:
>>>>>
>>>>>
>>>>>>fond of partial completions, as I feel they add complexity,
>>>>>>particularly so in my case: I can simply use the same error paths for
>>>>>>both the single-sector taskfile and the "everything else" taskfile,
>>>>>>regardless of which taskfile throws the error.
>>>>>
>>>>>
>>>>>It's just a question of maintaining the proper request state so you
>>>>>know how much and what part of a request is pending. Requests have been
>>>>>handled this way ever since clustered requests, that is why
>>>>>current_nr_sectors differs from nr_sectors. And with hard_* duplicates,
>>>>>it's pretty easy to extend this a bit. I don't see this as something
>>>>>complex, and if the alternative you are suggesting (your implementation
>>>>>idea is not clear to me...) is to fork another request then I think it's
>>>>>a lot better.
>>>>
>>>>[snip howto]
>>>>
>>>>Yeah, I know how to do partial completions. The increased complexity
>>>>arises in my driver. It's simply less code in my driver to treat each
>>>>transaction as an "all or none" affair.
>>>>
>>>>For the vastly common case, it's less i-cache and less interrupts to do
>>>>all-or-none. In the future I'll probably want to put partial
>>>>completions in the error path...
>>>
>>>
>>>Oh come on, i-cache? We're doing IO here, a cache line more or less in
>>>request handling is absolutely so much in the noise.
>>>
>>>What are the "increased complexity" involved with doing partial
>>>completions? You don't even have to know it's a partial request in the
>>>error handling, it's "just the request" state. Honestly, I don't see a
>>>problem there. You'll have to expand on what exactly you see as added
>>>complexity. To me it still seems like the fastest and most elegant way
>>>to handle it. It requires no special attention on request buildup, it
>>>requires no extra request and ugly split-code in the request handling.
>>>And the partial-completions come for free with the block layer code.
>>
>>libata, drivers/ide, and SCSI all must provide internal "submit this
>>taskfile/cdb" API that is decoupled from struct request. Therefore,
>
>
> Yes
>
>
>>submitting a transaction pair, or for ATAPI submitting the internal
>>REQUEST SENSE, is quite simple and only a few lines of code.
>
>
> SCSI already does these partial completions...
>
>
>>Any extra diddling of the hardware, and struct request, to provide
>>partial completions is extra code. The hardware is currently set up to
>>provide only "it's done" or "it failed" information. Logically, then,
>>partial completions must be more code than the current <none> ;-)
>
>
> That's not a valid argument. Whatever you do, you have to add some lines
> of code.

Right. But the point of mentioning "decouple[...]" above was that the
simplest path is to submit two requests to the hardware, and then make a
single function call into {scsi|block} to complete the transaction.

Current non-errata case: 1 taskfile, 1 completion func call
Upcoming errata solution: 2 taskfiles, 1 completion func call
Your errata suggestion seems to be: 2 taskfiles, 2 completion func calls

That's obviously more work and more code for the errata case.

And for the non-errata case, partial completions don't make any sense at
all.


>>WRT error handling, according to ATA specs I can look at the error
>>information to determine how much of the request, if any, completed
>>successfully. (dunno if this is also doable on ATAPI) That's why
>>partial completions in the error path make sense to me.
>
>
> ... so if you do partial completions in the normal paths (or rather
> allow them), error handling will be simpler. And we all know where the

In the common non-errata case, there is never a partial completion.


> hard and stupid bugs are - the basically never tested error handling.

I have :) libata error handling is stupid and simple, but it's also
solid and easy to verify. Yet another path to be honed, of course :)


>>>>see. I'll implement whichever is easier first, which will certainly
>>>>be better than the current sledgehammer limit. Any improvement over
>>>>the
>>>
>>>
>>>Definitely, the current static limit completely sucks...
>>>
>>>
>>>
>>>>current code will provide dramatic performance increases, and we can
>>>>tune after that...
>>>
>>>
>>>A path needs to be chosen first, though.
>>
>>The path has been chosen: the "it works" solution first, then tune.
>>:)
>
>
> Since one path excludes the other, you must choose a path first. Tuning
> is honing a path, not rewriting that code.

The first depends on the second. The "it works" solution creates the
path to be honed.

Jeff



2003-11-30 17:57:18

by Vojtech Pavlik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, Nov 30, 2003 at 06:10:06PM +0100, Jens Axboe wrote:

> On Sun, Nov 30 2003, Jeff Garzik wrote:
> > Bartlomiej Zolnierkiewicz wrote:
> > >On Sunday 30 of November 2003 17:51, Jens Axboe wrote:
> > >>>Tangent: My non-pessimistic fix will involve submitting a single sector
> > >>>DMA r/w taskfile manually, then proceeding with the remaining sectors in
> > >>>another r/w taskfile. This doubles the interrupts on the affected
> > >>>chipset/drive combos, but still allows large requests. I'm not terribly
> > >>
> > >>Or split the request 50/50.
> > >
> > >
> > >We can't - hardware will lock up.
> >
> > Well, the constraint we must satisfy is
> >
> > sector_count % 15 != 1
>
> (sector_count % 15 != 1) && (sector_count != 1)
>
> to be more precise :)

I think you wanted to say:

(sector_count % 15 != 1) || (sector_count == 1)

--
Vojtech Pavlik
SuSE Labs, SuSE CR

Subject: Re: Silicon Image 3112A SATA trouble

On Sunday 30 of November 2003 18:19, Prakash K. Cheemplavam wrote:
> Bartlomiej Zolnierkiewicz wrote:
> > In 2.6.x there is no max_kb_per_request setting in
> > /proc/ide/hdx/settings. Therefore
> > echo "max_kb_per_request:128" > /proc/ide/hde/settings
> > does not work.
> >
> > Hmm. actually I was under influence that we have generic ioctls in 2.6.x,
> > but I can find only BLKSECTGET, BLKSECTSET was somehow lost. Jens?
> >
> > Prakash, please try patch and maybe you will have 2 working drivers now
> > :-).
>
> OK, this driver fixes the transfer rate problem. Nice, so I wanted to do
> the right thing, but it didn't work, as you explained... Thanks.

Cool.

> Nevertheless there is still the issue left:
>
> hdparm -d1 /dev/hde makes the drive get into major havoc (something like:
> ide: dma_intr: status=0x58 { DriveReady, SeekComplete, DataRequest }
>
> ide status timeout=0xd8 { Busy }; messages taken from swsusp's kernel panic
> ). Have to do a hard reset. I guess it is the same reason why swsusp
> gets a kernel panic when it sends PM commands to siimage.c. (Maybe the
> same error is in libata, causing the same kernel panic on swsusp.)
>
> Any clues?

Strange. While doing 'hdparm -d1 /dev/hde' the same code path is executed
as is executed during boot, so probably the device is in a different state
or you hit some weird driver bug :/.

And you are right, that's the reason why swsusp panics.

--bart

2003-11-30 18:21:49

by Jens Axboe

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, Nov 30 2003, Jeff Garzik wrote:
> Jens Axboe wrote:
> >On Sun, Nov 30 2003, Jeff Garzik wrote:
> >
> >>Jens Axboe wrote:
> >>
> >>>On Sun, Nov 30 2003, Jeff Garzik wrote:
> >>>
> >>>
> >>>>Jens Axboe wrote:
> >>>>
> >>>>
> >>>>>On Sun, Nov 30 2003, Jeff Garzik wrote:
> >>>>>
> >>>>>
> >>>>>>fond of partial completions, as I feel they add complexity,
> >>>>>>particularly so in my case: I can simply use the same error paths
> >>>>>>for both the single-sector taskfile and the "everything else"
> >>>>>>taskfile, regardless of which taskfile throws the error.
> >>>>>
> >>>>>
> >>>>>It's just a question of maintaining the proper request state so you
> >>>>>know how much and what part of a request is pending. Requests have been
> >>>>>handled this way ever since clustered requests, that is why
> >>>>>current_nr_sectors differs from nr_sectors. And with hard_* duplicates,
> >>>>>it's pretty easy to extend this a bit. I don't see this as something
> >>>>>complex, and if the alternative you are suggesting (your implementation
> >>>>>idea is not clear to me...) is to fork another request then I think
> >>>>>it's
> >>>>>a lot better.
> >>>>
> >>>>[snip howto]
> >>>>
> >>>>Yeah, I know how to do partial completions. The increased complexity
> >>>>arises in my driver. It's simply less code in my driver to treat each
> >>>>transaction as an "all or none" affair.
> >>>>
> >>>>For the vastly common case, it's less i-cache and less interrupts to do
> >>>>all-or-none. In the future I'll probably want to put partial
> >>>>completions in the error path...
> >>>
> >>>
> >>>Oh come on, i-cache? We're doing IO here, a cache line more or less in
> >>>request handling is absolutely so much in the noise.
> >>>
> >>>What are the "increased complexity" involved with doing partial
> >>>completions? You don't even have to know it's a partial request in the
> >>>error handling, it's "just the request" state. Honestly, I don't see a
> >>>problem there. You'll have to expand on what exactly you see as added
> >>>complexity. To me it still seems like the fastest and most elegant way
> >>>to handle it. It requires no special attention on request buildup, it
> >>>requires no extra request and ugly split-code in the request handling.
> >>>And the partial-completions come for free with the block layer code.
> >>
> >>libata, drivers/ide, and SCSI all must provide internal "submit this
> >>taskfile/cdb" API that is decoupled from struct request. Therefore,
> >
> >
> >Yes
> >
> >
> >>submitting a transaction pair, or for ATAPI submitting the internal
> >>REQUEST SENSE, is quite simple and only a few lines of code.
> >
> >
> >SCSI already does these partial completions...
> >
> >
> >>Any extra diddling of the hardware, and struct request, to provide
> >>partial completions is extra code. The hardware is currently set up to
> >>provide only "it's done" or "it failed" information. Logically, then,
> >>partial completions must be more code than the current <none> ;-)
> >
> >
> >That's not a valid argument. Whatever you do, you have to add some lines
> >of code.
>
> Right. But the point with mentioning "decouple[...]" above was that the
> most simple path is to submit two requests to hardware, and then a
> single function call into {scsi|block} to complete the transaction.
>
> Current non-errata case: 1 taskfile, 1 completion func call
> Upcoming errata solution: 2 taskfiles, 1 completion func call
> Your errata suggestion seems to be: 2 taskfiles, 2 completion func calls
>
> That's obviously more work and more code for the errata case.

I don't see why, it's exactly 2 x non-errata case.

> And for the non-errata case, partial completions don't make any sense at
> all.

Of course, you would always complete these fully. But having partial
completions at the lowest layer gives them to you for free. The non-errata
case uses the exact same path, it just happens to complete 100% of the
request all the time.

> >>WRT error handling, according to ATA specs I can look at the error
> >>information to determine how much of the request, if any, completed
> >>successfully. (dunno if this is also doable on ATAPI) That's why
> >>partial completions in the error path make sense to me.
> >
> >
> >... so if you do partial completions in the normal paths (or rather
> >allow them), error handling will be simpler. And we all know where the
>
> In the common non-errata case, there is never a partial completion.

Right. But as you mention, error handling is a partial completion by
nature (almost always).

> >hard and stupid bugs are - the basically never tested error handling.
>
> I have :) libata error handling is stupid and simple, but it's also
> solid and easy to verify. Yet another path to be honed, of course :)

That's good :). But even given that, error handling is usually the less
tested path (by far). I do commend your 'keep it simple', I think that's
key there.

> >>>>see. I'll implement whichever is easier first, which will certainly
> >>>>be better than the current sledgehammer limit. Any improvement over
> >>>>the
> >>>
> >>>
> >>>Definitely, the current static limit completely sucks...
> >>>
> >>>
> >>>
> >>>>current code will provide dramatic performance increases, and we can
> >>>>tune after that...
> >>>
> >>>
> >>>A path needs to be chosen first, though.
> >>
> >>The path has been chosen: the "it works" solution first, then tune.
> >>:)
> >
> >
> >Since one path excludes the other, you must choose a path first. Tuning
> >is honing a path, not rewriting that code.
>
> The first depends on the second. The "it works" solution creates the
> path to be honed.

Precisely. But there is more than one workable way to fix it :)

--
Jens Axboe

2003-11-30 18:17:51

by Jens Axboe

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, Nov 30 2003, Vojtech Pavlik wrote:
> On Sun, Nov 30, 2003 at 06:10:06PM +0100, Jens Axboe wrote:
>
> > On Sun, Nov 30 2003, Jeff Garzik wrote:
> > > Bartlomiej Zolnierkiewicz wrote:
> > > >On Sunday 30 of November 2003 17:51, Jens Axboe wrote:
> > > >>>Tangent: My non-pessimistic fix will involve submitting a single sector
> > > >>>DMA r/w taskfile manually, then proceeding with the remaining sectors in
> > > >>>another r/w taskfile. This doubles the interrupts on the affected
> > > >>>chipset/drive combos, but still allows large requests. I'm not terribly
> > > >>
> > > >>Or split the request 50/50.
> > > >
> > > >
> > > >We can't - hardware will lock up.
> > >
> > > Well, the constraint we must satisfy is
> > >
> > > sector_count % 15 != 1
> >
> > (sector_count % 15 != 1) && (sector_count != 1)
> >
> > to be more precise :)
>
> I think you wanted to say:
>
> (sector_count % 15 != 1) || (sector_count == 1)

Ehm no, I don't think so... To my knowledge, sector_count == 1 is ok. If
not, the hardware would be seriously screwed (ok it is already) beyond
software fixups.

--
Jens Axboe

2003-11-30 18:19:50

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Jens Axboe wrote:
> On Sun, Nov 30 2003, Vojtech Pavlik wrote:
>
>>On Sun, Nov 30, 2003 at 06:10:06PM +0100, Jens Axboe wrote:
>>
>>
>>>On Sun, Nov 30 2003, Jeff Garzik wrote:
>>>
>>>>Bartlomiej Zolnierkiewicz wrote:
>>>>
>>>>>On Sunday 30 of November 2003 17:51, Jens Axboe wrote:
>>>>>
>>>>>>>Tangent: My non-pessimistic fix will involve submitting a single sector
>>>>>>>DMA r/w taskfile manually, then proceeding with the remaining sectors in
>>>>>>>another r/w taskfile. This doubles the interrupts on the affected
>>>>>>>chipset/drive combos, but still allows large requests. I'm not terribly
>>>>>>
>>>>>>Or split the request 50/50.
>>>>>
>>>>>
>>>>>We can't - hardware will lock up.
>>>>
>>>>Well, the constraint we must satisfy is
>>>>
>>>> sector_count % 15 != 1
>>>
>>> (sector_count % 15 != 1) && (sector_count != 1)
>>>
>>>to be more precise :)
>>
>>I think you wanted to say:
>>
>> (sector_count % 15 != 1) || (sector_count == 1)
>
>
> Ehm no, I don't think so... To my knowledge, sector_count == 1 is ok. If
> not, the hardware would be seriously screwed (ok it is already) beyond
> software fixups.


Now that you've kicked my brain into action, yes, sector_count==1 is ok.
It's all about limiting the data FIS... and with sector_count==1
there is no worry about the data FIS in this case.

Jeff


2003-11-30 18:22:57

by Jens Axboe

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, Nov 30 2003, Jeff Garzik wrote:
> Jens Axboe wrote:
> >On Sun, Nov 30 2003, Vojtech Pavlik wrote:
> >
> >>On Sun, Nov 30, 2003 at 06:10:06PM +0100, Jens Axboe wrote:
> >>
> >>
> >>>On Sun, Nov 30 2003, Jeff Garzik wrote:
> >>>
> >>>>Bartlomiej Zolnierkiewicz wrote:
> >>>>
> >>>>>On Sunday 30 of November 2003 17:51, Jens Axboe wrote:
> >>>>>
> >>>>>>>Tangent: My non-pessimistic fix will involve submitting a single
> >>>>>>>sector
> >>>>>>>DMA r/w taskfile manually, then proceeding with the remaining
> >>>>>>>sectors in
> >>>>>>>another r/w taskfile. This doubles the interrupts on the affected
> >>>>>>>chipset/drive combos, but still allows large requests. I'm not
> >>>>>>>terribly
> >>>>>>
> >>>>>>Or split the request 50/50.
> >>>>>
> >>>>>
> >>>>>We can't - hardware will lock up.
> >>>>
> >>>>Well, the constraint we must satisfy is
> >>>>
> >>>> sector_count % 15 != 1
> >>>
> >>> (sector_count % 15 != 1) && (sector_count != 1)
> >>>
> >>>to be more precise :)
> >>
> >>I think you wanted to say:
> >>
> >> (sector_count % 15 != 1) || (sector_count == 1)
> >
> >
> >Ehm no, I don't think so... To my knowledge, sector_count == 1 is ok. If
> >not, the hardware would be seriously screwed (ok it is already) beyond
> >software fixups.
>
>
> Now that you've kicked my brain into action, yes, sector_count==1 is ok.
> It's all about limiting the data FIS... and with sector_count==1
> there is no worry about the data FIS in this case.

Ah, my line wasn't completely clear (to say the least)... So to clear
all doubts:

        if ((sector_count % 15 == 1) && (sector_count != 1))
                errata path

Agree?

--
Jens Axboe
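
[editor's note: for illustration only, a minimal sketch of how a driver
could apply that check and pick a split, along the lines of the
"single-sector taskfile first, then the rest" fix mentioned earlier in
the thread; the function below is hypothetical, not actual siimage or
libata code:]

        /* Number of sectors to peel off into a first, separate taskfile
         * so that neither piece trips the errata condition
         * (count % 15 == 1 with count != 1).
         * Returns 0 when the request is already safe as-is. */
        static unsigned int sil_errata_split(unsigned int sector_count)
        {
                if (sector_count == 1 || (sector_count % 15) != 1)
                        return 0;
                /* Peeling off one sector leaves a remainder with
                 * count % 15 == 0, and a single-sector taskfile itself
                 * is allowed. */
                return 1;
        }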

2003-11-30 18:31:53

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Jens Axboe wrote:
> Ah, my line wasn't completely clear (to say the least)... So to clear
> all doubts:
>
> if ((sector_count % 15 == 1) && (sector_count != 1))
> errata path
>
> Agree?


Agreed.


The confusion here is most likely my fault, as my original post
intentionally inverted the logic for illustrative purposes (hah!)...

> Well, the constraint we must satisfy is
>
> sector_count % 15 != 1
>
> (i.e. "== 1" causes the lockup)

And to think, English is my only language...

Jeff



2003-11-30 18:27:28

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Craig Bradney wrote:
> On the topic of speeds.. hdparm -t gives me 56Mb/s on my Maxtor 80Mb 8mb
> cache PATA drive. I got that with 2.4.23 pre 8 which was ATA100 and get
> just a little more on ATA133 with 2.6. Not sure what people are
> expecting on SATA.


Serial ATA merely changes the bus, a.k.a. the interface between drive
and system.

This doesn't mean that the drive itself will be any faster... most
first-gen SATA drives are just PATA drives with a new circuit board and
new firmware. Just like some SCSI and IDE drives are exactly the same
platters, but have differing circuit boards and connectors...

Jeff



2003-11-30 19:04:55

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Jens Axboe wrote:
> On Sun, Nov 30 2003, Jeff Garzik wrote:
>>Current non-errata case: 1 taskfile, 1 completion func call
>>Upcoming errata solution: 2 taskfiles, 1 completion func call
>>Your errata suggestion seems to be: 2 taskfiles, 2 completion func calls
>>
>>That's obviously more work and more code for the errata case.
>
>
> I don't see why, it's exactly 2 x non-errata case.

Since the hardware request API is (and must be) completely decoupled
from struct request API, I can achieve 1.5 x non-errata case.


>>And for the non-errata case, partial completions don't make any sense at
>>all.
>
>
> Of course, you would always complete these fully. But having partial
> completions at the lowest layer gives it to you for free. non-errata
> case uses the exact same path, it just happens to complete 100% of the
> request all the time.

[editor's note: I wonder if I've broken a grammar rule using so many
"non"s in a single email]

If I completely ignore partial completions on the normal [non-error]
paths, the current errata and non-errata struct request completion paths
would be exactly the same. Only the error path would differ. The
lowest [hardware req API] layer's request granularity is a single
taskfile, so it will never know about partial completions.



>>>>WRT error handling, according to ATA specs I can look at the error
>>>>information to determine how much of the request, if any, completed
>>>>successfully. (dunno if this is also doable on ATAPI) That's why
>>>>partial completions in the error path make sense to me.
>>>
>>>
>>>... so if you do partial completions in the normal paths (or rather
>>>allow them), error handling will be simpler. And we all know where the
>>
>>In the common non-errata case, there is never a partial completion.
>
>
> Right. But as you mention, error handling is a partial completion by
> nature (almost always).

Agreed. Just in case I transposed a word or something, I wish to
clarify: both errata and error paths are almost always partial completions.

However... for the case where both errata taskfiles complete
_successfully_, it is better to have only 1 completion on the hot path (the
"1.5 x" mentioned above). Particularly considering that errata
taskfiles are contiguous, and the second taskfile will complete fairly
quickly after the first...

The slow, error path is a whole different matter. Ignoring partial
completions in the normal path keeps the error path simple, for errata
and non-errata cases. Handling partial completions in the error code,
for both errata and non-errata cases, is definitely something I want to
do in the future.


>>>hard and stupid bugs are - the basically never tested error handling.
>>
>>I have :) libata error handling is stupid and simple, but it's also
>>solid and easy to verify. Yet another path to be honed, of course :)
>
>
> That's good :). But even given that, error handling is usually the less
> tested path (by far). I do commend your 'keep it simple', I think that's
> key there.

As a tangent, I'm hoping to convince some drive manufacturers (under NDA
most likely, unfortunately) to provide special drive firmwares that will
simulate read and write errors. i.e. fault injection.

Jeff



2003-11-30 19:45:05

by Vojtech Pavlik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, Nov 30, 2003 at 01:31:35PM -0500, Jeff Garzik wrote:

> >Ah, my line wasn't completely clear (to say the least)... So to clear
> >all doubts:
> >
> > if ((sector_count % 15 == 1) && (sector_count != 1))
> > errata path
> >
> >Agree?
>
>
> Agreed.
>
>
> The confusion here is most likely my fault, as my original post
> intentionally inverted the logic for illustrative purposes (hah!)...

Yeah, and there was an error in the inversion, since if you invert the
above statement, it looks like this:

        if ((sector_count % 15 != 1) || (sector_count == 1))
                ok path
        else
                errata path

Logic can be a bitch at times.

--
Vojtech Pavlik
SuSE Labs, SuSE CR
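
[editor's note: a quick brute-force check, standalone and hypothetical,
that the two formulations in this subthread really are complements of
each other:]

        #include <assert.h>

        int main(void)
        {
                unsigned int n;

                for (n = 1; n <= 4096; n++) {
                        int errata = (n % 15 == 1) && (n != 1);  /* lockup-prone counts */
                        int ok     = (n % 15 != 1) || (n == 1);  /* safe counts */
                        assert(ok == !errata);  /* the two forms partition all counts */
                }
                return 0;
        }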

2003-11-30 19:39:22

by Jens Axboe

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, Nov 30 2003, Jeff Garzik wrote:
> Jens Axboe wrote:
> >On Sun, Nov 30 2003, Jeff Garzik wrote:
> >>Current non-errata case: 1 taskfile, 1 completion func call
> >>Upcoming errata solution: 2 taskfiles, 1 completion func call
> >>Your errata suggestion seems to be: 2 taskfiles, 2 completion func calls
> >>
> >>That's obviously more work and more code for the errata case.
> >
> >
> >I don't see why, it's exactly 2 x non-errata case.
>
> Since the hardware request API is (and must be) completely decoupled
> from struct request API, I can achieve 1.5 x non-errata case.

Hmm I don't follow that... Being a bit clever, you could even send off
both A and B parts of the request in one go. Probably not worth it
though, that would add some complexity (things like not spanning a page,
stuff you probably don't want to bother the driver with).

> >>And for the non-errata case, partial completions don't make any sense at
> >>all.
> >
> >
> >Of course, you would always complete these fully. But having partial
> >completions at the lowest layer gives it to you for free. non-errata
> >case uses the exact same path, it just happens to complete 100% of the
> >request all the time.
>
> [editor's note: I wonder if I've broken a grammar rule using so many
> "non"s in a single email]

Hehe

> If I completely ignore partial completions on the normal [non-error]
> paths, the current errata and non-errata struct request completion paths
> would be exactly the same. Only the error path would differ. The
> lowest [hardware req API] layer's request granularity is a single
> taskfile, so it will never know about partial completions.

Indeed. The partial completions only exist at the driver -> block layer
(or -> scsi) layer, not talking to the hardware. The hardware always
gets 'a request', if that just happens to be only a part of a struct
request so be it.

> >>>>WRT error handling, according to ATA specs I can look at the error
> >>>>information to determine how much of the request, if any, completed
> >>>>successfully. (dunno if this is also doable on ATAPI) That's why
> >>>>partial completions in the error path make sense to me.
> >>>
> >>>
> >>>... so if you do partial completions in the normal paths (or rather
> >>>allow them), error handling will be simpler. And we all know where the
> >>
> >>In the common non-errata case, there is never a partial completion.
> >
> >
> >Right. But as you mention, error handling is a partial completion by
> >nature (almost always).
>
> Agreed. Just in case I transposed a word or something, I wish to
> clarify: both errata and error paths are almost always partial
> completions.

Yup agree.

> However... for the case where both errata taskfiles complete
> _successfully_, it is better to have only 1 completion on the hot path (the
> "1.5 x" mentioned above). Particularly considering that errata
> taskfiles are contiguous, and the second taskfile will complete fairly
> quickly after the first...

Sure yes, the fewer completions the better. Where do you get the 1.5
from? You need to split the request handling no matter what for the
errata path, I would count that as 2 completions.

> The slow, error path is a whole different matter. Ignoring partial
> completions in the normal path keeps the error path simple, for errata
> and non-errata cases. Handling partial completions in the error code,

How so?? There are no partial completions in the normal path. In fact,
ignore the term 'partial completion'. Just think completion count or
something like that. At end_io time, you look at how much io has
completed, and you complete that back to the layer above you (block or
scsi). The normal path would always have count == total request, and you
are done. The errata (and error) path would have count <= total request,
which just means you have a line or two of C to tell the layer above you
not to put the request back at the front. That's about it for added code.
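
[editor's note: a simplified sketch of the completion style described
above, using the 2.6-era block-layer helpers; this is illustrative, not
the actual libata or drivers/ide code. 'nsect' is however many sectors
the just-finished taskfile actually transferred:]

        #include <linux/blkdev.h>

        static void complete_taskfile_sectors(struct request *rq,
                                              int uptodate, int nsect)
        {
                /* end_that_request_first() returns non-zero while part of
                 * the request is still pending; in that case the remainder
                 * stays at the head of the queue and the driver simply
                 * issues the next taskfile for it. */
                if (!end_that_request_first(rq, uptodate, nsect)) {
                        blkdev_dequeue_request(rq);     /* fully completed */
                        end_that_request_last(rq);
                }
        }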

I think we are talking somewhat past each other. I don't mean to imply
that you want partial completions in the non-errata path. Of course you
don't. I'm purely talking about completion of a count of data which
doesn't necessarily have to be the total struct request size. Your
taskfile tells you how much.

> for both errata and non-errata cases, is definitely something I want to
> do in the future.

Well yes, you have to.

> >>>hard and stupid bugs are - the basically never tested error handling.
> >>
> >>I have :) libata error handling is stupid and simple, but it's also
> >>solid and easy to verify. Yet another path to be honed, of course :)
> >
> >
> >That's good :). But even given that, error handling is usually the less
> >tested path (by far). I do commend your 'keep it simple', I think that's
> >key there.
>
> As a tangent, I'm hoping to convince some drive manufacturers (under NDA
> most likely, unfortunately) to provide special drive firmwares that will
> simulate read and write errors. i.e. fault injection.

Would be nice to have ways of doing that which are better than 'mark
this drive bad with a label and pull it out of the drawer for testing
error handling' :)

--
Jens Axboe

2003-11-30 20:35:47

by Jeff Garzik

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Jens Axboe wrote:
> On Sun, Nov 30 2003, Jeff Garzik wrote:
>>Since the hardware request API is (and must be) completely decoupled
>>from struct request API, I can achieve 1.5 x non-errata case.
>
> Hmm I don't follow that... Being a bit clever, you could even send off
> both A and B parts of the request in one go. Probably not worth it
> though, that would add some complexity (things like not spanning a page,
> stuff you probably don't want to bother the driver with).
[...]
> Indeed. The partial completions only exist at the driver -> block layer
> (or -> scsi) layer, not talking to the hardware. The hardware always
> gets 'a request', if that just happens to be only a part of a struct
> request so be it.
[...]
> Sure yes, the fewer completions the better. Where do you get the 1.5
> from? You need to split the request handling no matter what for the
> errata path, I would count that as 2 completions.

Taskfile completion and struct request completion are separate. That
results in

* struct request received by libata
* libata detects errata
* libata creates 2 struct ata_queued_cmd's
* libata calls ata_qc_push() 2 times
* Time passes
* ata_qc_complete called 2 times
Option 1: {scsi|block} complete called 2 times, once for each taskfile
Option 2: {scsi|block} complete called 1 time, when both taskfiles are done

one way (Option 2): 2 h/w completions, 1 struct request completion == 1.5
another way (Option 1): 2 h/w completions, 2 struct request completions == 2.0

Maybe another way of looking at it:
It's a question of where the state is stored -- in ata_queued_cmd or
entirely in struct request -- and what are the benefits/downsides of each.

When a single struct request causes the initiation of multiple
ata_queued_cmd's, libata must be capable of knowing when multiple
ata_queued_cmds forming a whole have completed. struct request must
also know this. _But_. The key distinction is that the multiple requests
libata must handle might not be based on sector progress.

For this SII errata, I _could_ do this at the block layer:
ata_qc_complete() -> blk_end_io(first half of sectors)
ata_qc_complete() -> blk_end_io(some more sectors)

And the request would be completed by the block layer (right?).

But under the hood, libata has to handle these situations:
* One or more ATA commands must complete in succession, before the
struct request may be end_io'd.
* One or more ATA commands must complete asynchronously, before the
struct request may be end_io'd.
* These ATA commands might not be sector based: sometimes aggressive
power management means that libata must issue and complete a PM-related
taskfile, before issuing the {READ|WRITE} DMA passed to it in the struct
request.

I'm already storing and handling this stuff at the hardware-queue level.
(remember hardware queues often bottleneck at the host and/or bus
levels, not necessarily the request_queue level)

So what all this hopefully boils down to is: if I have to do "internal
completions" anyway, it's just more work for libata to separate out the
2 taskfiles into 2 block layer completions. For both errata and
non-errata paths, I can just say "the last taskfile is done, clean up"
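
[editor's note: a hypothetical sketch of that "complete the struct
request only when the last taskfile of the pair finishes" idea. The
structure, field names and the end_request_fully() helper are
illustrative only, not real libata code:]

        #include <linux/blkdev.h>
        #include <asm/atomic.h>

        struct errata_pair {
                atomic_t        pending;  /* taskfiles still in flight: 2, 1, 0 */
                struct request  *rq;      /* the one struct request both serve */
        };

        /* hardware-level completion, called once per finished taskfile */
        static void qc_pair_done(struct errata_pair *pair, int uptodate)
        {
                if (atomic_dec_and_test(&pair->pending)) {
                        /* last taskfile of the pair: one block-layer
                         * completion, i.e. the "2 h/w completions,
                         * 1 struct request completion" (1.5x) case */
                        end_request_fully(pair->rq, uptodate);  /* hypothetical helper */
                }
        }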



Yet another way of looking at it:
In order for all state to be kept at the block layer level, you would
need this check:

        if ((rq->expected_taskfiles == rq->completed_taskfiles) &&
            (rq->expected_sectors == rq->completed_sectors))
                the struct request is "complete"

and each call to end_io would require both a taskfile count and a sector
count, which would increment ->completed_taskfiles and ->completed_sectors.

Note1: s/taskfile/cdb/ if that's your fancy :)
Note2: ->completed_sectors exists today under another name, yes, I know :)

Jeff



2003-11-30 21:05:32

by Yaroslav Klyukin

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Jeff Garzik wrote:
> Jens Axboe wrote:
>
>> Ah, my line wasn't completely clear (to say the least)... So to clear
>> all doubts:
>>
>> if ((sector_count % 15 == 1) && (sector_count != 1))
>> errata path
>>
>> Agree?
>
>
>
> Agreed.
>
>
> The confusion here is most likely my fault, as my original post
> intentionally inverted the logic for illustrative purposes (hah!)...
>
>> Well, the constraint we must satisfy is
>>
>> sector_count % 15 != 1
>>
>> (i.e. "== 1" causes the lockup)


Hi, I just rebuilt my kernel with libata support.
I have a 3112 Silicon Image controller with an IDE drive attached.
(I have 2 IDE drives, so I bought a second controller to split the load between them. One is connected to the motherboard IDE controller.)

When I run the hdparm command, I can see strange behaviour of the libata driver:

[root@shrike root]# hdparm -t /dev/hda

/dev/hda:
Timing buffered disk reads: 64 MB in 1.19 seconds = 53.78 MB/sec
[root@shrike root]# hdparm -t /dev/sda1

/dev/sda1:
Timing buffered disk reads: read(1048576) returned 921600 bytes
[root@shrike root]#


Meanwhile -T switch works normally.


I know that siimage support is broken; any ideas what could possibly cause such errors?
Is it somehow linked to the error discussed in this message thread?


WBR.

2003-11-30 21:15:30

by Craig Bradney

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

On Sun, 2003-11-30 at 19:41, Luis Miguel García wrote:
> so definitely, 32 MB/s is almost half the speed that you get. I'm in
> 2.6-test11. I don't know more options to try. The next will be booting
> with "noapic nolapic". Some people reported better results with this.
>
> by the way, I have booted with "doataraid noraid" (no drives connected,
> only SATA support in bios), and nothing is shown in the boot messages
> (nor dmesg) about libata being loaded. I don't know if I must connect a
> hard drive and then the driver shows up, but I don't think that.

Depends on a lot of things, especially what else is on that controller.
The way I found this out was through my involvement in the Scribus DTP
project. I upgraded to the Athlon 2600 from a Duron 900.

That took me from 30+ mins for a Scribus compile down to 2 mins (!) if
there is no secondary drive. With my old 20GB drive plugged in to do
copies off of it.. 11 mins. Do you have anything else plugged in there?
Of course, I guess in that case the 8MB drive cache is helping a lot.

Craig


>
>
> Thanks!
>
> LuisMi Garcia
>
>
> Craig Bradney wrote:
>
> > On the topic of speeds.. hdparm -t gives me 56Mb/s on my Maxtor 80Mb 8mb
> > cache PATA drive. I got that with 2.4.23 pre 8 which was ATA100 and get
> > just a little more on ATA133 with 2.6. Not sure what people are
> > expecting on SATA.
> >
> > Craig
> >
> > On Sun, 2003-11-30 at 18:52, Luis Miguel García wrote:
> >
> >
> >> hello:
> >>
> >> I have a Seagate Barracuda IV (80 Gb) connected to parallel ata on a
> >> nforce-2 motherboard.
> >>
> >> If any of you want for me to test any patch to fix the "seagate
> >> issue", please, count on me. I have a SATA sis3112 and a
> >> parallel-to-serial converter. If I'm of any help to you, drop me an
> >> email.
> >>
> >> By the way, I'm only getting 32 MB/s (hdparm -tT /dev/hda) on my
> >> actual parallel ata. Is this enough for an ATA-100 device?
> >>
> >> Thanks a lot.
> >>
> >> LuisMi García
> >> Spain
> >>

2003-11-30 21:34:30

by Prakash K. Cheemplavam

[permalink] [raw]
Subject: Re: Silicon Image 3112A SATA trouble

Bartlomiej Zolnierkiewicz wrote:
> On Sunday 30 of November 2003 18:19, Prakash K. Cheemplavam wrote:
>
>>Bartlomiej Zolnierkiewicz wrote:
>>
>>>In 2.6.x there is no max_kb_per_request setting in
>>>/proc/ide/hdx/settings. Therefore
>>> echo "max_kb_per_request:128" > /proc/ide/hde/settings
>>>does not work.
>>>
>>>Hmm. actually I was under influence that we have generic ioctls in 2.6.x,
>>>but I can find only BLKSECTGET, BLKSECTSET was somehow lost. Jens?
>>>
>>>Prakash, please try patch and maybe you will have 2 working drivers now
>>>:-).
>>
>>OK, this driver fixes the transfer rate problem. Nice, so I wanted to do
>>the right thing, but it didn't work, as you explained... Thanks.
>
>
> Cool.
>
>
>>Nevertheless there is still the issue left:
>>
>>hdparm -d1 /dev/hde makes the drive get into major havoc (something like:
>>ide: dma_intr: status=0x58 { DriveReady, SeekComplete, DataRequest }
>>
>>ide status timeout=0xd8 { Busy }; messages taken from swsusp's kernel panic
>>). Have to do a hard reset. I guess it is the same reason why swsusp
>>gets a kernel panic when it sends PM commands to siimage.c. (Maybe the
>>same error is in libata, causing the same kernel panic on swsusp.)
>>
>>Any clues?
>
>
> Strange. While doing 'hdparm -d1 /dev/hde' the same code path is executed
> which is executed during boot so probably device is in different state or you
> hit some weird driver bug :/.
>
> And you are right, thats the reason why swsusp panics.

I think the bug is that the driver specifically doesn't like my
controller + SATA converter + HD combination. As I stated in my very
first message, on HD access siimage.c constantly calls:

static int siimage_mmio_ide_dma_test_irq (ide_drive_t *drive)
{
        ide_hwif_t *hwif = HWIF(drive);
        unsigned long base = (unsigned long)hwif->hwif_data;
        unsigned long addr = siimage_selreg(hwif, 0x1);

        if (SATA_ERROR_REG) {
                u32 ext_stat = hwif->INL(base + 0x10);
                u8 watchdog = 0;
                if (ext_stat & ((hwif->channel) ? 0x40 : 0x10)) {
//                      u32 sata_error = hwif->INL(SATA_ERROR_REG);
//                      hwif->OUTL(sata_error, SATA_ERROR_REG);
//                      watchdog = (sata_error & 0x00680000) ? 1 : 0;
//#if 1
//                      printk(KERN_WARNING "%s: sata_error = 0x%08x, "
//                              "watchdog = %d, %s\n",
//                              drive->name, sata_error, watchdog,
//                              __FUNCTION__);
//#endif

                } else {

That's why I commented the above portions out, otherwise my dmesg gets
flooded. What is strange: when I compile the kernel to *not* enable DMA
at boot, the siimage DMA gets enabled nevertheless, so I am not sure
whether hdparm -d1 and kernel boot take the same path to enable DMA. It
seems some sort of hack within siimage.c is used to enable DMA on my
drive. Remember, I have no native SATA drive; maybe that's the problem.

Prakash