2003-05-10 01:26:46

by Rob Ekl

Subject: 2.5.69, IDE TCQ can't be enabled

Hi. Please respond directly, as I am not subscribed to the list.

I started testing 2.5.69, and I can't seem to get TCQ enabled on my
drives. The test-tcq.pl script indicates that both drives in this
computer support TCQ:
# ./test-tcq.pl
/proc/ide/ide0/hda (WDC WD1200JB-00CRA1) supports TCQ
/proc/ide/ide3/hdg (Maxtor 6E030L0) supports TCQ

No "TCQ enabled" messages are displayed.
/proc/ide/hda/settings says "using_tcq 0 0 32", and changing that does
nothing, i.e. after "echo using_tcq:32 > /proc/ide/hda/settings", cat
settings still shows 0 for the value.
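
The write-then-read-back check described above can be sketched as a small parser. This is only an illustration: it assumes each line of the settings file starts with the setting name followed by value, minimum and maximum (as in the "using_tcq 0 0 32" line quoted above), and read_setting is a hypothetical helper, not part of any existing tool.

```python
def read_setting(settings_text, name):
    """Return (value, minimum, maximum) for `name`, or None if absent.

    Assumes lines shaped like "using_tcq 0 0 32 ...", i.e. the name
    followed by three integers; any further columns are ignored.
    """
    for line in settings_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == name:
            value, minimum, maximum = (int(f) for f in fields[1:4])
            return value, minimum, maximum
    return None


def setting_took_effect(settings_text, name, wanted):
    """True if the value read back after a write equals what was written."""
    current = read_setting(settings_text, name)
    return current is not None and current[0] == wanted
```

In the situation reported here, setting_took_effect(text, "using_tcq", 32) would come back False, because the value read back is still 0.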

hdparm -Q /dev/hda shows "queue_depth = 0 (off)", then hdparm -Q 8
/dev/hda gives:
setting DMA queue_depth to 8 (on)
HDIO_SET_QDMA failed: Input/output error
queue_depth = 0 (off)



According to this message:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103718775526780&w=2
there are no requirements for the controller, etc. I have tried both
controllers in the computer: the motherboard SiS controller
(SIS5513/Sis735) and the PCI card PDC20262, with no difference.

I have TCQ enabled in the kernel, with "TCQ on by default". Below are the
.config and dmesg output. What else should I try?



CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_EXPERIMENTAL=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSCTL=y
CONFIG_LOG_BUF_SHIFT=14
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_OBSOLETE_MODPARM=y
CONFIG_KMOD=y
CONFIG_X86_PC=y
CONFIG_MK7=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_USE_3DNOW=y
CONFIG_PREEMPT=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_NOHIGHMEM=y
CONFIG_MTRR=y
CONFIG_HAVE_DEC_LOCK=y
CONFIG_PM=y
CONFIG_ACPI=y
CONFIG_ACPI_BOOT=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_BUS=y
CONFIG_ACPI_INTERPRETER=y
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_PCI=y
CONFIG_ACPI_SYSTEM=y
CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y
CONFIG_ISA=y
CONFIG_HOTPLUG=y
CONFIG_PCMCIA_PROBE=y
CONFIG_KCORE_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=y
CONFIG_PARPORT=y
CONFIG_PARPORT_PC=y
CONFIG_PARPORT_PC_CML1=y
CONFIG_PARPORT_1284=y
CONFIG_PNP=y
CONFIG_PNP_NAMES=y
CONFIG_BLK_DEV_FD=y
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_BLK_DEV_IDECD=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_BLK_DEV_GENERIC=y
CONFIG_IDEPCI_SHARE_IRQ=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
CONFIG_BLK_DEV_IDE_TCQ=y
CONFIG_BLK_DEV_IDE_TCQ_DEFAULT=y
CONFIG_BLK_DEV_IDE_TCQ_DEPTH=8
CONFIG_IDEDMA_PCI_AUTO=y
CONFIG_BLK_DEV_IDEDMA=y
CONFIG_BLK_DEV_ADMA=y
CONFIG_BLK_DEV_PDC202XX_OLD=y
CONFIG_BLK_DEV_PDC202XX_NEW=y
CONFIG_BLK_DEV_SIS5513=y
CONFIG_IDEDMA_AUTO=y
CONFIG_BLK_DEV_PDC202XX=y
CONFIG_BLK_DEV_IDE_MODES=y
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_BLK_DEV_SR=y
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=y
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_REPORT_LUNS=y
CONFIG_SCSI_AHA1542=m
CONFIG_SCSI_AIC7XXX=m
CONFIG_AIC7XXX_CMDS_PER_DEVICE=32
CONFIG_AIC7XXX_RESET_DELAY_MS=15000
CONFIG_AIC7XXX_DEBUG_ENABLE=y
CONFIG_AIC7XXX_DEBUG_MASK=0
CONFIG_AIC7XXX_REG_PRETTY_PRINT=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IPV6_SCTP__=y
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
CONFIG_NET_ETHERNET=y
CONFIG_NET_PCI=y
CONFIG_SIS900=y
CONFIG_INPUT=y
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_SOUND_GAMEPORT=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_CORE=y
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=256
CONFIG_PRINTER=y
CONFIG_AGP=y
CONFIG_AGP_SIS=y
CONFIG_AGP_SWORKS=y
CONFIG_VIDEO_DEV=y
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
CONFIG_FS_MBCACHE=y
CONFIG_REISERFS_FS=y
CONFIG_AUTOFS4_FS=y
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=y
CONFIG_UDF_FS=y
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_PROC_FS=y
CONFIG_DEVFS_FS=y
CONFIG_DEVFS_MOUNT=y
CONFIG_DEVPTS_FS=y
CONFIG_TMPFS=y
CONFIG_RAMFS=y
CONFIG_NFS_FS=y
CONFIG_NFSD=y
CONFIG_LOCKD=y
CONFIG_EXPORTFS=y
CONFIG_SUNRPC=y
CONFIG_SMB_FS=y
CONFIG_MSDOS_PARTITION=y
CONFIG_SMB_NLS=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
CONFIG_FB=y
CONFIG_FB_VESA=y
CONFIG_VIDEO_SELECT=y
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_SOUND=y
CONFIG_SND=y
CONFIG_SND_SEQUENCER=y
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=y
CONFIG_SND_PCM_OSS=y
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_EMU10K1=y
CONFIG_SND_INTEL8X0=y
CONFIG_USB=y
CONFIG_USB_DEVICEFS=y
CONFIG_USB_BANDWIDTH=y
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_UHCI_HCD=y
CONFIG_USB_STORAGE=y
CONFIG_USB_STORAGE_FREECOM=y
CONFIG_USB_STORAGE_ISD200=y
CONFIG_USB_STORAGE_JUMPSHOT=y
CONFIG_USB_HID=y
CONFIG_USB_HIDINPUT=y
CONFIG_USB_SERIAL=y
CONFIG_USB_SERIAL_PL2303=y
CONFIG_ZLIB_INFLATE=y
CONFIG_X86_BIOS_REBOOT=y





Linux version 2.5.69 (root@axp) (gcc version 3.2.2) #2 Thu May 8 23:56:39
CDT 2003
Video mode to be used for restore is f00
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000001fff0000 (usable)
BIOS-e820: 000000001fff0000 - 000000001fff8000 (ACPI data)
BIOS-e820: 000000001fff8000 - 0000000020000000 (ACPI NVS)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000ffee0000 - 00000000fff00000 (reserved)
BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
511MB LOWMEM available.
On node 0 totalpages: 131056
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 126960 pages, LIFO batch:16
HighMem zone: 0 pages, LIFO batch:1
ACPI: RSDP (v000 AMI ) @ 0x000fa340
ACPI: RSDT (v001 AMIINT SiS735XX 00000.04096) @ 0x1fff0000
ACPI: FADT (v001 AMIINT SiS735XX 00000.04096) @ 0x1fff0030
ACPI: DSDT (v001 SiS 735 00000.00256) @ 0x00000000
ACPI: BIOS passes blacklist
Building zonelist for node : 0
Kernel command line: BOOT_IMAGE=2569-1 ro root=301 1
Initializing CPU#0
PID hash table entries: 2048 (order 11: 16384 bytes)
Detected 1540.469 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 3047.42 BogoMIPS
Memory: 514536k/524224k available (2641k kernel code, 8940k reserved, 741k
data, 132k init, 0k highmem)
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
-> /dev
-> /dev/console
-> /root
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
CPU: After generic, caps: 0383f9ff c1c3f9ff 00000000 00000020
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: AMD Athlon(tm) XP 1800+ stepping 02
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v2.0 (20020519)
PCI: PCI BIOS revision 2.10 entry at 0xfdb01, last bus=1
PCI: Using configuration type 1
BIO: pool of 256 setup, 14Kb (56 bytes/bio)
biovec pool[0]: 1 bvecs: 256 entries (12 bytes)
biovec pool[1]: 4 bvecs: 256 entries (48 bytes)
biovec pool[2]: 16 bvecs: 256 entries (192 bytes)
biovec pool[3]: 64 bvecs: 256 entries (768 bytes)
biovec pool[4]: 128 bvecs: 256 entries (1536 bytes)
biovec pool[5]: 256 bvecs: 256 entries (3072 bytes)
ACPI: Subsystem revision 20030418
ACPI: Interpreter enabled
ACPI: Using PIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: Power Resource [URP1] (off)
ACPI: Power Resource [URP2] (off)
ACPI: Power Resource [FDDP] (off)
ACPI: Power Resource [LPTP] (off)
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 *5 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 10 11 *12 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 10 11 12 14 15, disabled)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 10 11 12 14 15, disabled)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 *5 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 *10 11 12 14 15)
Linux Plug and Play Support v0.96 (c) Adam Belay
block request queues:
128 requests per read queue
128 requests per write queue
8 requests per batch
enter congestion at 15
exit congestion at 17
SCSI subsystem initialized
drivers/usb/core/usb.c: registered new driver usbfs
drivers/usb/core/usb.c: registered new driver hub
ACPI: PCI Interrupt Link [LNKE] enabled at IRQ 10
ACPI: PCI Interrupt Link [LNKF] enabled at IRQ 11
PCI: Using ACPI for IRQ routing
PCI: if you experience problems, try using option 'pci=noacpi' or even
'acpi=off'
Initializing RT netlink socket
Enabling SEP on CPU 0
devfs: v1.22 (20021013) Richard Gooch ([email protected])
devfs: boot_options: 0x1
Installing knfsd (copyright (C) 1996 [email protected]).
udf: registering filesystem
ACPI: Power Button (FF) [PWRF]
ACPI: Sleep Button (CM) [SLPB]
ACPI: Processor [CPU1] (supports C1)
pty: 256 Unix98 ptys configured
request_module: failed /sbin/modprobe -- parport_lowlevel. error = -16
lp: driver loaded but no devices found
Linux agpgart interface v0.100 (c) Dave Jones
agpgart: Detected SiS 735 chipset
agpgart: Maximum main memory to use for agp memory: 439M
agpgart: AGP aperture is 64M @ 0xd0000000
Serial: 8250/16550 driver $Revision: 1.90 $ IRQ sharing disabled
tts/0 at I/O 0x3f8 (irq = 4) is a 16550A
tts/1 at I/O 0x2f8 (irq = 3) is a 16550A
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
sis900.c: v1.08.06 9/24/2002
eth0: Realtek RTL8201 PHY transceiver found at address 9.
eth0: Using transceiver found at address 9 as default
eth0: SiS 900 PCI Fast Ethernet at 0xc400, IRQ 5, 00:07:95:xx:xx:xx.
Linux video capture interface: v1.00
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with
idebus=xx
SIS5513: IDE controller at PCI slot 00:02.5
SIS5513: chipset revision 208
SIS5513: not 100% native mode: will probe irqs later
SiS735 ATA 100 controller
ide0: BM-DMA at 0xff00-0xff07, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xff08-0xff0f, BIOS settings: hdc:DMA, hdd:DMA
hda: WDC WD1200JB-00CRA1, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdd: Pioneer DVD-ROM ATAPIModel DVD-116 0107, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
PDC20262: IDE controller at PCI slot 00:0b.0
PDC20262: chipset revision 1
PDC20262: not 100% native mode: will probe irqs later
PDC20262: ROM enabled at 0xcffd0000
PDC20262: (U)DMA Burst Bit ENABLED Primary PCI Mode Secondary PCI Mode.
ide2: BM-DMA at 0xc800-0xc807, BIOS settings: hde:pio, hdf:pio
ide3: BM-DMA at 0xc808-0xc80f, BIOS settings: hdg:pio, hdh:pio
hdg: Maxtor 6E030L0, ATA DISK drive
ide3 at 0xd000-0xd007,0xcc02 on irq 5
hda: host protected area => 1
hda: 234441648 sectors (120034 MB) w/8192KiB Cache, CHS=232581/16/63,
UDMA(100)
/dev/ide/host0/bus0/target0/lun0: p1 p2 p3 p4 < p5 p6 >
hdg: host protected area => 1
hdg: 60058656 sectors (30750 MB) w/2048KiB Cache, CHS=59582/16/63,
UDMA(66)
/dev/ide/host2/bus1/target0/lun0: p1 p2 p3
end_request: I/O error, dev hdd, sector 0
hdd: ATAPI 40X DVD-ROM drive, 256kB Cache, UDMA(66)
Uniform CD-ROM driver Revision: 3.12
end_request: I/O error, dev hdd, sector 0
request_module: failed /sbin/modprobe -- scsi_hostadapter. error = -16
ehci-hcd 00:0d.2: NEC Corporation USB 2.0
ehci-hcd 00:0d.2: irq 11, pci mem e082ef00
Please use the 'usbfs' filetype instead, the 'usbdevfs' name is
deprecated.
ehci-hcd 00:0d.2: new USB bus registered, assigned bus number 1
ehci-hcd 00:0d.2: USB 2.0 enabled, EHCI 0.95, driver 2003-Jan-22
hub 1-0:0: USB hub found
hub 1-0:0: 5 ports detected
ohci-hcd: 2003 Feb 24 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ohci-hcd: block sizes: ed 64 td 64
ohci-hcd 00:02.2: Silicon Integrated S 7001
ohci-hcd 00:02.2: irq 12, pci mem e0830000
ohci-hcd 00:02.2: new USB bus registered, assigned bus number 2
hub 2-0:0: USB hub found
hub 2-0:0: 3 ports detected
ohci-hcd 00:02.3: Silicon Integrated S 7001 (#2)
ohci-hcd 00:02.3: irq 10, pci mem e0832000
ohci-hcd 00:02.3: new USB bus registered, assigned bus number 3
hub 3-0:0: USB hub found
hub 3-0:0: 3 ports detected
ohci-hcd 00:0d.0: NEC Corporation USB
ohci-hcd 00:0d.0: irq 11, pci mem e0834000
ohci-hcd 00:0d.0: new USB bus registered, assigned bus number 4
hub 4-0:0: USB hub found
hub 4-0:0: 3 ports detected
hub 3-0:0: debounce: port 1: delay 100ms stable 4 status 0x301
hub 3-0:0: new USB device on port 1, assigned address 2
ohci-hcd 00:0d.1: NEC Corporation USB (#2)
ohci-hcd 00:0d.1: irq 12, pci mem e0836000
ohci-hcd 00:0d.1: new USB bus registered, assigned bus number 5
hub 5-0:0: USB hub found
hub 5-0:0: 2 ports detected
drivers/usb/host/uhci-hcd.c: USB Universal Host Controller Interface
driver v2.0
Initializing USB Mass Storage driver...
drivers/usb/core/usb.c: registered new driver usb-storage
USB Mass Storage support registered.
input: USB HID v1.10 Mouse [Logitech USB Receiver] on usb-00:02.3-1
drivers/usb/core/usb.c: registered new driver hid
drivers/usb/input/hid-core.c: v2.0:USB HID core driver
drivers/usb/core/usb.c: registered new driver usbserial
drivers/usb/serial/usb-serial.c: USB Serial Driver core v2.0
drivers/usb/serial/usb-serial.c: USB Serial support registered for PL-2303
drivers/usb/core/usb.c: registered new driver pl2303
drivers/usb/serial/pl2303.c: Prolific PL2303 USB to serial adaptor driver
v0.9
mice: PS/2 mouse device common for all mice
input: AT Set 2 keyboard on isa0060/serio0
serio: i8042 KBD port at 0x60,0x64 irq 1
Advanced Linux Sound Architecture Driver Version 0.9.2 (Thu Mar 20
13:31:57 2003 UTC).
request_module: failed /sbin/modprobe -- snd-card-0. error = -16
ALSA device list:
#0: Sound Blaster Live! (rev.7) at 0xc000, irq 12
NET4: Linux TCP/IP 1.0 for NET4.0
IP: routing cache hash table of 4096 buckets, 32Kbytes
TCP: Hash tables configured (established 32768 bind 65536)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
UDF-fs DEBUG fs/udf/lowlevel.c:65:udf_get_last_session: CDROMMULTISESSION
not supported: rc=-25
UDF-fs DEBUG fs/udf/super.c:1472:udf_fill_super: Multi-session=0
UDF-fs DEBUG fs/udf/super.c:460:udf_vrs: Starting at sector 16 (2048 byte
sectors)
UDF-fs DEBUG fs/udf/super.c:1208:udf_check_valid: Failed to read byte
32768. Assuming open disc. Skipping validity check
UDF-fs DEBUG fs/udf/misc.c:286:udf_read_tagged: location mismatch block
256, tag 3274178688 != 256
UDF-fs DEBUG fs/udf/super.c:1262:udf_load_partition: No Anchor block found
UDF-fs: No partition found (1)
found reiserfs format "3.6" with standard journal
Reiserfs journal params: device ide/host0/bus0/target0/lun0/par, size
8192, journal first block 18, max trans len 1024, max batch 900, max
commit age 30, max trans age 30
reiserfs: checking transaction log (ide/host0/bus0/target0/lun0/par) for
(ide/host0/bus0/target0/lun0/par)
Using r5 hash to sort names
VFS: Mounted root (reiserfs filesystem) readonly.
Mounted devfs on /dev
Freeing unused kernel memory: 132k freed
Adding 530136k swap on /dev/hda2. Priority:2 extents:1
found reiserfs format "3.6" with standard journal
Reiserfs journal params: device ide/host2/bus1/target0/lun0/par, size
8192, journal first block 18, max trans len 1024, max batch 900, max
commit age 30, max trans age 30
reiserfs: checking transaction log (ide/host2/bus1/target0/lun0/par) for
(ide/host2/bus1/target0/lun0/par)
Using r5 hash to sort names
eth0: Media Link On 100mbps full-duplex
hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete
DataRequest }
blk: queue c048ba9c, I/O limit 4095Mb (mask 0xffffffff)
blk: queue c048d4c4, I/O limit 4095Mb (mask 0xffffffff)
hda: dma_timer_expiry: dma status == 0x60
hda: timeout waiting for DMA
hda: timeout waiting for DMA
hda: (__ide_dma_test_irq) called while not waiting



2003-05-10 16:40:51

by Rob Ekl

Subject: Re: 2.5.69, IDE TCQ can't be enabled *RESOLVED*

On Fri, 9 May 2003 [email protected] wrote:

> I started testing 2.5.69, and I can't seem to get TCQ enabled on my
> drives. The test-tcq.pl script indicates that both drives in this
> computer support TCQ:
> # ./test-tcq.pl
> /proc/ide/ide0/hda (WDC WD1200JB-00CRA1) supports TCQ
> /proc/ide/ide3/hdg (Maxtor 6E030L0) supports TCQ
>


I looked at this a little more. It seems that the test-tcq.pl that I
downloaded did not have the bit-shift operators correct. I don't know how
that happened, but it caused the script to incorrectly report TCQ support
for my drives.

After looking at the bits in the "drive features supported" and comparing
against the spec, neither drive supports TCQ. The test-tcq.pl script with
the correct bit-shift operators also shows that neither drive supports
TCQ.
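
For readers who want to repeat the check by hand, here is a minimal sketch of the bit test. It is not the test-tcq.pl script. It assumes the IDENTIFY DEVICE data is available as whitespace-separated 16-bit hex words, and it relies on the ATA layout as I understand it: word 83 bit 1 means READ/WRITE DMA QUEUED is supported, word 83 is valid only when bit 14 is set and bit 15 is clear, and word 75 bits 4:0 hold the maximum queue depth minus one. The function names are made up for illustration.

```python
def parse_identify_words(text):
    """Turn whitespace-separated 16-bit hex words into a list of ints."""
    return [int(token, 16) for token in text.split()]


def supports_queued_dma(words):
    """Check word 83 bit 1 (assumed: READ/WRITE DMA QUEUED supported)."""
    w83 = words[83]
    # Bits 15:14 must read 01 for the word to carry valid feature bits.
    if (w83 >> 14) & 0x3 != 0x1:
        return False
    # Shifting by the wrong amount here tests a different feature bit,
    # which is the kind of mistake that yields a false "supports TCQ".
    return bool((w83 >> 1) & 1)


def max_queue_depth(words):
    """Word 75 bits 4:0 hold (maximum queue depth - 1)."""
    return (words[75] & 0x1F) + 1
```

A word 83 value of 0x4001 has only bit 0 set, so supports_queued_dma returns False for it, while 0x4002 returns True.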

So, I guess you can ignore my previous message...

2003-05-10 17:13:31

by Rob Ekl

Subject: Re: 2.5.69, IDE TCQ can't be enabled *RESOLVED*

> After looking at the bits in the "drive features supported" and comparing
> against the spec, neither drive supports TCQ. The test-tcq.pl script with
> the correct bit-shift operators also shows that neither drive supports
> TCQ.


For future reference, hdparm (since version 4.2) also shows whether the
drive supports command queuing. If supported, it will show "Cmd queuing"
under "Capabilities".

2003-05-12 12:34:18

by Oleg Drokin

Subject: Re: 2.5.69, IDE TCQ can't be enabled

Hello!

On Fri, May 09, 2003 at 08:38:51PM -0500, [email protected] wrote:
> I started testing 2.5.69, and I can't seem to get TCQ enabled on my
> drives. The test-tcq.pl script indicates that both drives in this
> computer support TCQ:
> # ./test-tcq.pl
> /proc/ide/ide0/hda (WDC WD1200JB-00CRA1) supports TCQ
> /proc/ide/ide3/hdg (Maxtor 6E030L0) supports TCQ
> found reiserfs format "3.6" with standard journal
> Reiserfs journal params: device ide/host2/bus1/target0/lun0/par, size
> 8192, journal first block 18, max trans len 1024, max batch 900, max
> commit age 30, max trans age 30
> reiserfs: checking transaction log (ide/host2/bus1/target0/lun0/par) for
> (ide/host2/bus1/target0/lun0/par)

Just a note that we have found TCQ unusable on our IBM drives, and we have
had some reports of TCQ being unusable on some WD drives.

Unusable means severe FS corruption starting from mount.
So if your filesystems suddenly start to break, please start looking for
the cause by disabling TCQ.

Bye,
Oleg

2003-05-12 12:42:44

by Oliver Neukum

Subject: Re: 2.5.69, IDE TCQ can't be enabled


> Just a note that we have found TCQ unusable on our IBM drives and we had
> some reports about TCQ unusable on some WD drives.
>
> Unusable means severe FS corruptions starting from mount.
> So if your FSs will suddenly start to break, start looking for cause with
> disabling TCQ, please.

I can confirm that. This drive Model=IBM-DTLA-307045, FwRev=TX6OA60A,
SerialNo=YMCYMT3Y229 has eaten my filesystem with TCQ on 2.5.69

Regards
Oliver

Subject: Re: 2.5.69, IDE TCQ can't be enabled


On Mon, 12 May 2003, Oliver Neukum wrote:

> > Just a note that we have found TCQ unusable on our IBM drives and we had
> > some reports about TCQ unusable on some WD drives.
> >
> > Unusable means severe FS corruptions starting from mount.
> > So if your FSs will suddenly start to break, start looking for cause with
> > disabling TCQ, please.
>
> I can confirm that. This drive Model=IBM-DTLA-307045, FwRev=TX6OA60A,
> SerialNo=YMCYMT3Y229 has eaten my filesystem with TCQ on 2.5.69
>
> Regards
> Oliver

TCQ is marked EXPERIMENTAL and is known to be broken.
Probably it should be marked DANGEROUS or removed?

Alan, what do you think?

--
Bartlomiej

2003-05-12 13:10:11

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Mon, May 12 2003, Bartlomiej Zolnierkiewicz wrote:
>
> On Mon, 12 May 2003, Oliver Neukum wrote:
>
> > > Just a note that we have found TCQ unusable on our IBM drives and we had
> > > some reports about TCQ unusable on some WD drives.
> > >
> > > Unusable means severe FS corruptions starting from mount.
> > > So if your FSs will suddenly start to break, start looking for cause with
> > > disabling TCQ, please.
> >
> > I can confirm that. This drive Model=IBM-DTLA-307045, FwRev=TX6OA60A,
> > SerialNo=YMCYMT3Y229 has eaten my filesystem with TCQ on 2.5.69
> >
> > Regards
> > Oliver
>
> TCQ is marked EXPERIMENTAL and is known to be broken.
> Probably it should be marked DANGEROUS or removed?

Something external probably broke it long ago, I think it can be fixed
pretty easily. I just need to do it... Perhaps just removing the config
option would be safest?

--
Jens Axboe

2003-05-12 13:09:31

by Oleg Drokin

Subject: Re: 2.5.69, IDE TCQ can't be enabled

Hello!

On Mon, May 12, 2003 at 03:16:17PM +0200, Bartlomiej Zolnierkiewicz wrote:
> > > Just a note that we have found TCQ unusable on our IBM drives and we had
> > > some reports about TCQ unusable on some WD drives.
> > > Unusable means severe FS corruptions starting from mount.
> > > So if your FSs will suddenly start to break, start looking for cause with
> > > disabling TCQ, please.
> > I can confirm that. This drive Model=IBM-DTLA-307045, FwRev=TX6OA60A,
> > SerialNo=YMCYMT3Y229 has eaten my filesystem with TCQ on 2.5.69
> TCQ is marked EXPERIMENTAL and is known to be broken.
> Probably it should be marked DANGEROUS or removed?

How do you think people will test code that is removed?
Or do you mean that nobody plans to look at this ever?
I remember that Jens Axboe promised to take a look at it some
months ago.

Bye,
Oleg

2003-05-12 13:11:32

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Mon, May 12 2003, Oleg Drokin wrote:
> Hello!
>
> On Mon, May 12, 2003 at 03:16:17PM +0200, Bartlomiej Zolnierkiewicz wrote:
> > > > Just a note that we have found TCQ unusable on our IBM drives and we had
> > > > some reports about TCQ unusable on some WD drives.
> > > > Unusable means severe FS corruptions starting from mount.
> > > > So if your FSs will suddenly start to break, start looking for cause with
> > > > disabling TCQ, please.
> > > I can confirm that. This drive Model=IBM-DTLA-307045, FwRev=TX6OA60A,
> > > SerialNo=YMCYMT3Y229 has eaten my filesystem with TCQ on 2.5.69
> > TCQ is marked EXPERIMENTAL and is known to be broken.
> > Probably it should be marked DANGEROUS or removed?
>
> How do you think people will test code that is removed?
> Or do you mean that nobody plans to look at this ever?
> I remember that Jens Axboe promised to take a look at it some
> months ago.

Yeah, that is correct. OTOH, it's not a great loss. The SATA tcq will be
much better, ide tcq is a really horrible beast.

--
Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled


On Mon, 12 May 2003, Oleg Drokin wrote:

> Hello!
>
> On Mon, May 12, 2003 at 03:16:17PM +0200, Bartlomiej Zolnierkiewicz wrote:
> > > > Just a note that we have found TCQ unusable on our IBM drives and we had
> > > > some reports about TCQ unusable on some WD drives.
> > > > Unusable means severe FS corruptions starting from mount.
> > > > So if your FSs will suddenly start to break, start looking for cause with
> > > > disabling TCQ, please.
> > > I can confirm that. This drive Model=IBM-DTLA-307045, FwRev=TX6OA60A,
> > > SerialNo=YMCYMT3Y229 has eaten my filesystem with TCQ on 2.5.69
> > TCQ is marked EXPERIMENTAL and is known to be broken.
> > Probably it should be marked DANGEROUS or removed?
>
> How do you think people will test code that is removed?

I wanted to remove the config option, just like it is for ide-tape.c
currently. But yes, it's better to mark it DANGEROUS...

> Or do you mean that nobody plans to look at this ever?
> I remember that Jens Axboe promised to take a look at it some
> months ago.

many months ago :\

btw, some older disks have broken TCQ implementation...
--
Bartlomiej

> Bye,
> Oleg

2003-05-12 13:23:36

by Jeff Garzik

Subject: Re: 2.5.69, IDE TCQ can't be enabled

Jens Axboe wrote:
> SATA tcq will be
> much better, ide tcq is a really horrible beast.


Seconded. ;-)

Jeff




2003-05-12 13:57:59

by Alan

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Mon, 2003-05-12 at 14:16, Bartlomiej Zolnierkiewicz wrote:
> TCQ is marked EXPERIMENTAL and is known to be broken.
> Probably it should be marked DANGEROUS or removed?
>
> Alan, what do you think?

If not then the drivers with their own request end code also need
fixing. I'd turn it off, it's not as if the drive firmware seems too
happy about it either.

2003-05-12 18:03:05

by Mudama, Eric

Subject: RE: 2.5.69, IDE TCQ can't be enabled


The only difference between SATA TCQ and PATA TCQ is that in PATA TCQ, the
drive doesn't report the active tag bitmap back to the host after each
command. Other than that they are functionally identical to my
understanding. (Yes, there are options like first-party DMA, but these are
not requirements)

Personally I'd like to see the option stay in there as experimental, it
helps us drive folks test stuff when we can just flip an option off/on to
get that functionality.

--eric

-----Original Message-----
From: Jens Axboe [mailto:[email protected]]
Sent: Monday, May 12, 2003 7:24 AM
To: Oleg Drokin
Cc: Bartlomiej Zolnierkiewicz; Alan Cox; Oliver Neukum;
[email protected]; [email protected]
Subject: Re: 2.5.69, IDE TCQ can't be enabled


On Mon, May 12 2003, Oleg Drokin wrote:
> Hello!
>
> On Mon, May 12, 2003 at 03:16:17PM +0200, Bartlomiej Zolnierkiewicz wrote:
> > > > Just a note that we have found TCQ unusable on our IBM drives and we had
> > > > some reports about TCQ unusable on some WD drives.
> > > > Unusable means severe FS corruptions starting from mount.
> > > > So if your FSs will suddenly start to break, start looking for cause with
> > > > disabling TCQ, please.
> > > I can confirm that. This drive Model=IBM-DTLA-307045, FwRev=TX6OA60A,
> > > SerialNo=YMCYMT3Y229 has eaten my filesystem with TCQ on 2.5.69
> > TCQ is marked EXPERIMENTAL and is known to be broken.
> > Probably it should be marked DANGEROUS or removed?
>
> How do you think people will test code that is removed?
> Or do you mean that nobody plans to look at this ever?
> I remember that Jens Axboe promised to take a look at it some
> months ago.

Yeah, that is correct. OTOH, it's not a great loss. The SATA tcq will be
much better, ide tcq is a really horrible beast.

--
Jens Axboe

2003-05-12 18:46:56

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Mon, May 12 2003, Mudama, Eric wrote:
>
> The only difference between SATA TCQ and PATA TCQ is that in PATA TCQ, the
> drive doesn't report the active tag bitmap back to the host after each
> command. Other than that they are functionally identical to my
> understanding. (Yes, there are options like first-party DMA, but these are
> not requirements)

You are ignoring the host side of things. PATA TCQ is basically
unsupportable without some hardware support (auto-poll). It's my
understanding that all SATA controllers do that.

Then there's the debate of whether TCQ is worth it at all, in general. I
feel that a few tags just to minimize the time spent when ending a
request to starting a new one is nice.

> Personally I'd like to see the option stay in there as experimental, it
> helps us drive folks test stuff when we can just flip an option off/on to
> get that functionality.

I agree, besides it just needs a bit of fixing, can't be much.

--
Jens Axboe

2003-05-12 19:08:12

by Mudama, Eric

Subject: RE: 2.5.69, IDE TCQ can't be enabled



-----Original Message-----
>From: Jens Axboe [mailto:[email protected]]
>Sent: Monday, May 12, 2003 1:00 PM
>To: Mudama, Eric
>Cc: Oleg Drokin; Bartlomiej Zolnierkiewicz; Alan Cox; Oliver Neukum;
>[email protected]; [email protected]
>Subject: Re: 2.5.69, IDE TCQ can't be enabled
>
>
>On Mon, May 12 2003, Mudama, Eric wrote:
>> The only difference between SATA TCQ and PATA TCQ is that in PATA TCQ, the
>> drive doesn't report the active tag bitmap back to the host after each
>> command. Other than that they are functionally identical to my
>> understanding. (Yes, there are options like first-party DMA, but these are
>> not requirements)
>
>You are ignoring the host side of things. PATA TCQ is basically
>unsupportable without some hardware support (auto-poll). It's my
>understanding that all SATA controllers do that.

The drive is always supposed to generate an interrupt when it sets the
service bit indicating it is ready to receive a service command.

The release interrupt tells the host the drive is doing a bus release
following a queued data command.

The service interrupt tells you you're going DRQ after receiving the service
command.

Maybe there are drives out there that don't support the configuration of
these interrupts... if that is the case, TCQ will never work "well" with
them since you'll need to poll on timer ticks or something, resulting in a
huge performance loss.

I can also picture autopolling for speed on a host controller, done by the
firmware/asic in the background, but all the SATA goodness should be
"achievable" using the interrupt architecture in the design.

>Then there's the debate of whether TCQ is worth it at all, in general. I
>feel that a few tags just to minimize the time spent when ending a
>request to starting a new one is nice.

TCQ shouldn't benefit writes significantly from a performance perspective if
the drive is reasonably smart. TCQ *will* have a huge performance
improvement for random reads since the drive can order responses based on
minimal rotational latency.

Increasing queue depth reduces the average seek time between commands, both
in distance and rotational latency. Provided a drive doesn't do dumb stuff
like we discussed earlier, then it should be good.
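
The effect of queue depth on random reads can be illustrated with a toy model. This is a sketch under heavy assumptions and says nothing about how real firmware schedules: request positions are uniform random points on a one-dimensional range, the "drive" always serves the pending request nearest the head, and rotational position is ignored. The name mean_seek_distance and all its parameters are invented for this example.

```python
import random


def mean_seek_distance(queue_depth, n_requests=2000, seed=0):
    """Average head travel per request with `queue_depth` requests pending.

    Toy model: positions are uniform in [0, 1); the drive greedily picks
    the pending request closest to the current head position. A depth of
    1 degenerates to serving requests in arrival order.
    """
    rng = random.Random(seed)
    pending = [rng.random() for _ in range(queue_depth)]
    head = 0.5
    total = 0.0
    for _ in range(n_requests):
        # Choose the closest pending request (nearest-first ordering).
        target = min(pending, key=lambda pos: abs(pos - head))
        pending.remove(target)
        total += abs(target - head)
        head = target
        # The host immediately refills the freed tag with a new request.
        pending.append(rng.random())
    return total / n_requests
```

Comparing mean_seek_distance(1) with mean_seek_distance(8) shows the average travel dropping as more requests are outstanding, which is the point made above about deeper queues and random reads. Note that pure nearest-first ordering can starve distant requests; the model ignores that.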

>> Personally I'd like to see the option stay in there as experimental, it
>> helps us drive folks test stuff when we can just flip an option off/on to
>> get that functionality.
>
>I agree, besides it just needs a bit of fixing, can't be much.

I'll do what I can to help in my spare time.

--eric

2003-05-12 19:22:30

by Jeff Garzik

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Mon, May 12, 2003 at 01:19:36PM -0600, Mudama, Eric wrote:
> >You are ignoring the host side of things. PATA TCQ is basically
> >unsupportable without some hardware support (auto-poll). It's my
> >understanding that all SATA controllers do that.
>
> The drive is always supposed to generate an interrupt when it sets the
> service bit indicating it is ready to receive a service command.
>
> The release interrupt tells the host the drive is doing a bus release
> following a queued data command.
>
> The service interrupt tells you you're going DRQ after receiving the service
> command.

Most Linux people with TCQ drives seem to have Hitachi (nee IBM)
ones AFAICS. These do not have a service interrupt (or at least,
do not report such)

They do have the release interrupt.


> Maybe there are drives out there that don't support the configuration of
> these interrupts... if that is the case, TCQ will never work "well" with
> them since you'll need to poll on timer ticks or something, resulting in a
> huge performance loss.

yep :)


> >Then there's the debate of whether TCQ is worth it at all, in general. I
> >feel that a few tags just to minimize the time spent when ending a
> >request to starting a new one is nice.
>
> TCQ shouldn't benefit writes significantly from a performance perspective if
> the drive is reasonably smart. TCQ *will* have a huge performance
> improvement for random reads since the drive can order responses based on
> minimal rotational latency.

You hit the nail on the head.

With the host interface limitation of a single scatterlist
particularly, writes do not benefit very much at all. However,
since reads can be queued and buffered internally in the drive,
TCQ will definitely show benefits.

Coming from an OS perspective, I think we really want to be able to
queue up a bunch of scatterlists, like the new AHCI spec does.


> >> Personally I'd like to see the option stay in there as experimental, it
> >> helps us drive folks test stuff when we can just flip an option off/on to
> >> get that functionality.
> >
> >I agree, besides it just needs a bit of fixing, can't be much.
>
> I'll do what I can to help in my spare time.

Great! Your knowledge and experience are much appreciated.

Regards,

Jeff



2003-05-12 19:30:00

by Jeff Garzik

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Mon, May 12, 2003 at 11:58:10AM -0600, Mudama, Eric wrote:
> The only difference between SATA TCQ and PATA TCQ is that in PATA TCQ, the
> drive doesn't report the active tag bitmap back to the host after each
> command. Other than that they are functionally identical to my
> understanding. (Yes, there are options like first-party DMA, but these are
> not requirements)

That's from the "drive side." From the OS side, the ideal
implementation isn't here yet :)

Ideally there is a DMA ring of taskfiles and scatterlists. The OS
(producer) queues these up asynchronously, and the host+devices
(consumer) executes the taskfiles in the ring. AHCI does this.

With PATA TCQ, we only have a single scatterlist, and are forced to
have more OS-side infrastructure for command queueing, processing, etc.
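
The producer/consumer arrangement described above can be sketched as a bounded ring. This is only an illustration of the idea, not AHCI's actual command-list layout: the class name, the dictionary used as a "taskfile" and the tuple list used as a "scatterlist" are all hypothetical, and a real host may complete entries out of order.

```python
from collections import deque


class CommandRing:
    """Bounded FIFO of (taskfile, scatterlist) pairs; purely illustrative."""

    def __init__(self, slots):
        self.slots = slots
        self.entries = deque()

    def submit(self, taskfile, scatterlist):
        """Producer side (OS): queue a command, or report that the ring is full."""
        if len(self.entries) >= self.slots:
            return False
        self.entries.append((taskfile, scatterlist))
        return True

    def consume(self):
        """Consumer side (host + device): take the oldest queued command, if any."""
        return self.entries.popleft() if self.entries else None


ring = CommandRing(slots=2)
accepted = [
    ring.submit({"cmd": "read", "lba": lba}, [("buf", 4096)])
    for lba in (0, 8, 16)
]
first = ring.consume()
```

The point of the design is that each queued command carries its own scatterlist, so the OS never has to wait for the single shared one to become free before queueing more work.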

As an aside, as drives and hosts get faster, we will actually want
_fewer_ interrupts (i.e. interrupt coalescing).

All this points to making the host smarter.
The drives are already pretty damn smart ;-)

Jeff



2003-05-12 19:30:21

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Mon, May 12 2003, Jeff Garzik wrote:
> On Mon, May 12, 2003 at 01:19:36PM -0600, Mudama, Eric wrote:
> > >You are ignoring the host side of things. PATA TCQ is basically
> > >unsupportable without some hardware support (auto-poll). It's my
> > >understanding that all SATA controllers do that.
> >
> > The drive is always supposed to generate an interrupt when it sets the
> > service bit indicating it is ready to receive a service command.
> >
> > The release interrupt tells the host the drive is doing a bus release
> > following a queued data command.
> >
> > The service interrupt tells you you're going DRQ after receiving the service
> > command.
>
> Most Linux people with TCQ drives seem to have Hitachi (nee IBM)
> ones AFAICS. These do not have a service interrupt (or at least,
> do not report such)

Nonsense, it supports the service interrupt just fine. It will just
complain if you try to turn it off, iirc.

> They do have the release interrupt.

Which we don't use. To be interesting, you need to speculatively turn on
the dma engine for each command you want to start. If you don't do that,
then it's faster just to poll for release/no-release at command start
time.
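
In outline, "polling for release/no-release at command start" amounts to reading two registers once and decoding a few bits. The sketch below is not taken from ide-tcq. The bit values (REL as bit 2 of the sector count / interrupt reason register, SERV as bit 4 of the status register, tag in the upper five bits) are assumptions that should be checked against the ATA spec and the driver, and the function and key names are hypothetical.

```python
NSEC_REL = 0x04      # assumed: REL bit in the sector count / interrupt reason register
SERVICE_STAT = 0x10  # assumed: SERV bit in the status register


def decode_after_queued_command(status, nsect):
    """Decode one register snapshot taken right after issuing a queued command."""
    return {
        "released": bool(nsect & NSEC_REL),       # drive let go of the bus
        "service": bool(status & SERVICE_STAT),   # some tag is ready for SERVICE
        "tag": nsect >> 3,                        # tag the snapshot refers to
    }


bus_released = decode_after_queued_command(status=0x50, nsect=0x0C)
immediate = decode_after_queued_command(status=0x40, nsect=0x00)
```

If the drive did not release, the command is executing immediately and the driver can go straight to the data phase; if it did, the command is queued and the driver waits for a service indication.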

You should probably read ide-tcq to see what we do :-)

> > >Then there's the debate of whether TCQ is worth it at all, in general. I
> > >feel that a few tags just to minimize the time spent when ending a
> > >request to starting a new one is nice.
> >
> > TCQ shouldn't benefit writes significantly from a performance perspective if
> > the drive is reasonably smart. TCQ *will* have a huge performance
> > improvement for random reads since the drive can order responses based on
> > minimal rotational latency.
>
> You hit the nail on the head.
>
> With the host interface limitation of a single scatterlist
> particularly, writes do not benefit very much at all. However,
> since reads can be queued and buffered internally in the drive,
> TCQ will definitely show benefits.

I don't think the multiple pending _and_ active is that big a deal, and
besides _everybody_ uses write back caching on IDE which makes TCQ for
writes very uninteresting.

> Coming from an OS perspective, I think we really want to be able to
> queue up a bunch of scatterlists, like the new AHCI spec does.

I have to agree with Eric that the largest win is potentially not
getting hit by the rotational latency all the time. I don't think you'll
get much extra from actually having more than one active from the dma
POV.

--
Jens Axboe

2003-05-12 19:32:14

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Mon, May 12 2003, Jens Axboe wrote:
> > Coming from an OS perspective, I think we really want to be able to
> > queue up a bunch of scatterlists, like the new AHCI spec does.
>
> I have to agree with Eric that the largest win is potentially not
> getting hit by the rotational latency all the time. I don't think you'll
> get much extra from actually having more than one active from the dma
> POV.

Actually, thinking a bit about it, if you could have more than one
active command then the release interrupt gets more interesting.

I've been brain damaged from working on the current stuff. It's one
thing what the spec tells you, it's another what really works in reality
:/

--
Jens Axboe

2003-05-12 19:40:48

by Jeff Garzik

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Mon, May 12, 2003 at 09:42:45PM +0200, Jens Axboe wrote:
> On Mon, May 12 2003, Jeff Garzik wrote:
> > Most Linux people with TCQ drives seem to have Hitachi (nee IBM)
> > ones AFAICS. These do not have a service interrupt (or at least,
> > do not report such)
>
> Nonsense, it supports the service interrupt just fine. It will just
> complain if you try to turn it off, iirc.

Weird. Mine doesn't seem to assert it, nor does the identify page
indicate it's supported. Maybe I have a broken drive firmware.


> > They do have the release interrupt.
>
> Which we don't use. To be interesting, you need to speculatively turn on
> the dma engine for each command you want to start. If you don't do that,
> then it's faster just to poll for release/no-release at command start
> time.

That's an annoying thing about ATA TCQ: the command _may_ execute
immediately, or may be queued (even when queue is empty). At least
that's how I read the code and specs...


> I don't think the multiple pending _and_ active is that big a deal, and
> besides _everybody_ uses write back caching on IDE which makes TCQ for
> writes very uninteresting.
[...]
> I have to agree with Eric that the largest win is potentially not
> getting hit by the rotational latency all the time. I don't think you'll
> get much extra from actually having more than one active from the dma
> POV.

Yes and no. I am coming from a driver-complexity perspective:
single-active is more annoying on the driver side.

In terms of drive performance, multiple active probably doesn't make
a huge difference. In terms of reduction in host CPU usage, there
is a performance gain there with multiple active.

Jeff



2003-05-12 22:20:35

by Christer Weinigel

Subject: Re: 2.5.69, IDE TCQ can't be enabled

Jens Axboe <[email protected]> writes:

> I don't think the multiple pending _and_ active is that big a deal, and
> besides _everybody_ uses write back caching on IDE which makes TCQ for
> writes very uninteresting.

Isn't it recommended to turn off write back caching when doing
software raid? It will be hard to guarantee the consistency of the
raid set otherwise. So I think that TCQ can be very interesting for
some loads.

/Christer

--
"Just how much can I get away with and still go to heaven?"

Freelance consultant specializing in device driver programming for Linux
Christer Weinigel <[email protected]> http://www.weinigel.se

2003-05-13 06:29:16

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Tue, May 13 2003, Christer Weinigel wrote:
> Jens Axboe <[email protected]> writes:
>
> > I don't think the multiple pending _and_ active is that big a deal, and
> > besides _everybody_ uses write back caching on IDE which makes TCQ for
> > writes very uninteresting.
>
> Isn't it recommended to turn off write back caching when doing
> software raid? It will be hard to guarantee the consistency of the
> raid set otherwise. So I think that TCQ can be very interesting for
> some loads.

And for journalled file systems, for instance. But yes generally you are
right.

--
Jens Axboe

2003-05-13 06:28:29

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Mon, May 12 2003, Jeff Garzik wrote:
> On Mon, May 12, 2003 at 09:42:45PM +0200, Jens Axboe wrote:
> > On Mon, May 12 2003, Jeff Garzik wrote:
> > > Most Linux people with TCQ drives seem to have Hitachi (nee IBM)
> > > ones AFAICS. These do not have a service interrupt (or at least,
> > > do not report such)
> >
> > Nonsense, it supports the service interrupt just fine. It will just
> > complain if you try to turn it off, iirc.
>
> Weird. Mine doesn't seem to assert it, nor does the identify page
> indicate it's supported. Maybe I have a broken drive firmware.

Then the linux code won't work on it, have you tried? I've tried a lot
of different IBM models, they all do service interrupts just fine.

> > > They do have the release interrupt.
> >
> > Which we don't use. To be interesting, you need to speculatively turn on
> > the dma engine for each command you want to start. If you don't do that,
> > then it's faster just to poll for release/no-release at command start
> > time.
>
> That's an annoying thing about ATA TCQ: the command _may_ execute
> immediately, or may be queued (even when queue is empty). At least
> that's how I read the code and specs...

That's correct, you can use the release interrupt to get around that...

> > I don't think the multiple pending _and_ active is that big a deal, and
> > besides _everybody_ uses write back caching on IDE which makes TCQ for
> > writes very uninteresting.
> [...]
> > I have to agree with Eric that the largest win is potentially not
> > getting hit by the rotational latency all the time. I don't think you'll
> > get much extra from actually having more than one active from the dma
> > POV.
>
> Yes and no. I am coming from a driver-complexity perspective:
> single-active is more annoying on the driver side.
>
> In terms of drive performance, multiple active probably doesn't make
> a huge difference. In terms of reduction in host CPU usage, there
> is a performance gain there with multiple active.

It should make a non-negligible difference in smart positioning in the
drive; some things just cannot be done in software for this stuff.

--
Jens Axboe

2003-05-13 15:20:44

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Mon, May 12 2003, Bartlomiej Zolnierkiewicz wrote:
> On Mon, 12 May 2003, Oliver Neukum wrote:
>
> > > Just a note that we have found TCQ unusable on our IBM drives and we had
> > > some reports about TCQ unusable on some WD drives.
> > >
> > > Unusable means severe FS corruptions starting from mount.
> > > So if your FSs will suddenly start to break, start looking for cause with
> > > disabling TCQ, please.
> >
> > I can confirm that. This drive Model=IBM-DTLA-307045, FwRev=TX6OA60A,
> > SerialNo=YMCYMT3Y229 has eaten my filesystem with TCQ on 2.5.69

Oliver, what hardware are you reproducing this on? The DTLA should work.

--
Jens Axboe

2003-05-13 15:26:38

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Mon, May 12 2003, Alan Cox wrote:
> On Mon, 2003-05-12 at 14:16, Bartlomiej Zolnierkiewicz wrote:
> > TCQ is marked EXPERIMENTAL and is known to be broken.
> > Probably it should be marked DANGEROUS or removed?
> >
> > Alan, what do you think?
>
> If not then the drivers with their own request end code also need
> fixing. [snip]

Please expand on this one, thanks. As far as I can see, we handle
private ide_dma_end just fine.

--
Jens Axboe

2003-05-13 17:12:07

by Oliver Neukum

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Tuesday, 13 May 2003 17:32, Jens Axboe wrote:
> On Mon, May 12 2003, Bartlomiej Zolnierkiewicz wrote:
> > On Mon, 12 May 2003, Oliver Neukum wrote:
> > > > Just a note that we have found TCQ unusable on our IBM drives and we
> > > > had some reports about TCQ unusable on some WD drives.
> > > >
> > > > Unusable means severe FS corruptions starting from mount.
> > > > So if your FSs will suddenly start to break, start looking for cause
> > > > with disabling TCQ, please.
> > >
> > > I can confirm that. This drive Model=IBM-DTLA-307045, FwRev=TX6OA60A,
> > > SerialNo=YMCYMT3Y229 has eaten my filesystem with TCQ on 2.5.69
>
> Oliver, what hardware are you reproducing this on? The DTLA should work.

Athlon XP1600. But I am not reproducing this. I dare not. Is it important
enough to set up a scratch monkey? hdb did not show corruption. The raid
controller of the motherboard isn't used. APIC was enabled, ACPI wasn't.
The exact configuration died with the filesystem, sorry.

Regards
Oliver

oenone:/home/oliver # lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8366/A/7 [Apollo KT266/A/333]
00:01.0 PCI bridge: VIA Technologies, Inc. VT8366/A/7 [Apollo KT266/A/333 AGP]
00:05.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
00:06.0 RAID bus controller: Promise Technology, Inc. PDC20276 IDE (rev 01)
00:07.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)
00:09.0 USB Controller: VIA Technologies, Inc. USB (rev 50)
00:09.1 USB Controller: VIA Technologies, Inc. USB (rev 50)
00:09.2 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 51)
00:0d.0 SCSI storage controller: Tekram Technology Co.,Ltd. TRM-S1040 (rev 01)
00:0e.0 Network controller: AVM Audiovisuelles MKTG & Computer System GmbH Fritz!PCI v2.0 ISDN (rev 01)
00:0f.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 11)
00:0f.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 11)
00:10.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8233A ISA Bridge
00:11.1 IDE interface: VIA Technologies, Inc. VT82C586/B/686A/B PIPC Bus Master IDE (rev 06)
00:11.2 USB Controller: VIA Technologies, Inc. USB (rev 23)
00:11.3 USB Controller: VIA Technologies, Inc. USB (rev 23)
01:00.0 VGA compatible controller: ATI Technologies Inc Radeon VE QY

oenone:/home/oliver # hdparm -i /dev/hda

/dev/hda:

Model=IBM-DTLA-307045, FwRev=TX6OA60A, SerialNo=YMCYMT3Y229
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=40
BuffType=DualPortCache, BuffSize=1916kB, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=90069840
IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
AdvancedPM=yes: disabled (255) WriteCache=enabled
Drive conforms to: ATA/ATAPI-5 T13 1321D revision 1: 2 3 4 5

oenone:/home/oliver # hdparm -i /dev/hdb

/dev/hdb:

Model=IBM-DTTA-351010, FwRev=T56OA73A, SerialNo=WF0KFV79157
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=34
BuffType=DualPortCache, BuffSize=466kB, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=19807200
IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: sdma0 sdma1 sdma2 mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 *udma2
AdvancedPM=no WriteCache=enabled
Drive conforms to: ATA/ATAPI-4 T13 1153D revision 17: 1 2 3 4

oenone:/home/oliver # cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 6
model name : AMD Athlon(TM) XP 1600+
stepping : 2
cpu MHz : 1059.468
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow
bogomips : 2110.25

2003-05-13 17:15:59

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Tue, May 13 2003, Oliver Neukum wrote:
> On Tuesday, 13 May 2003 17:32, Jens Axboe wrote:
> > On Mon, May 12 2003, Bartlomiej Zolnierkiewicz wrote:
> > > On Mon, 12 May 2003, Oliver Neukum wrote:
> > > > > Just a note that we have found TCQ unusable on our IBM drives and we
> > > > > had some reports about TCQ unusable on some WD drives.
> > > > >
> > > > > Unusable means severe FS corruptions starting from mount.
> > > > > So if your FSs will suddenly start to break, start looking for cause
> > > > > with disabling TCQ, please.
> > > >
> > > > I can confirm that. This drive Model=IBM-DTLA-307045, FwRev=TX6OA60A,
> > > > SerialNo=YMCYMT3Y229 has eaten my filesystem with TCQ on 2.5.69
> >
> > Oliver, what hardware are you reproducing this on? The DTLA should work.
>
> Athlon XP1600. But I am not reproducing this. I dare not. Is it important
> enough to set up a scratch monkey? hdb did not show corruption. The raid
> controller of the motherboard isn't used. APIC was enabled, ACPI wasn't.
> The exact configuration died with the filesystem, sorry.

You don't have to reproduce, your case has two drives on a channel doing
tcq. That's not really supported, and the last patch sent should make
that scenario "work" (by not enabling tcq on any of them).
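
The policy described here (no TCQ when two drives share a channel) can be written down in a few lines. This is a sketch of the rule as stated in the message, not the actual patch; the function name and the dictionary layout are hypothetical, and the model strings are the ones from Oliver's hdparm output.

```python
def tcq_enable_map(channel):
    """channel maps drive name -> model string, or None for an empty slot.

    Returns, per present drive, whether TCQ may be enabled: only when the
    drive is alone on its channel.
    """
    present = [name for name, model in channel.items() if model is not None]
    allow = len(present) == 1
    return {name: allow for name in present}


shared = tcq_enable_map({"hda": "IBM-DTLA-307045", "hdb": "IBM-DTTA-351010"})
alone = tcq_enable_map({"hda": "IBM-DTLA-307045", "hdb": None})
```

For the reported setup, with both drives on one channel, the rule turns TCQ off for both of them.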

The DTTA, according to FreeBSD, has a bug with > 64K transfers. But you
said that worked fine, so...

Thanks for the feedback, much appreciated! FWIW, 2 hours of thrashing
the drive here and no problems so far.

--
Jens Axboe

2003-05-13 17:51:12

by Jeff Garzik

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Tue, May 13, 2003 at 07:00:20PM +0100, Dave Jones wrote:
> On Tue, May 13, 2003 at 08:40:59AM +0200, Jens Axboe wrote:
> > > Weird. Mine doesn't seem to assert it, nor does the identify page
> > > indicate it's supported. Maybe I have a broken drive firmware.
> >
> > Then the linux code won't work on it, have you tried? I've tried a lot
> > of different IBM models, they all do service interrupts just fine.
>
> bug in the firmware version on Jeffs drives perhaps ?

I'll check it out. The answer to Jens' question is no, I haven't
tried his TCQ stuff on this drive yet. Just poked and prodded it a lot
on my own :)

Jeff



2003-05-13 17:48:08

by Dave Jones

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Tue, May 13, 2003 at 08:40:59AM +0200, Jens Axboe wrote:
> > Weird. Mine doesn't seem to assert it, nor does the identify page
> > indicate it's supported. Maybe I have a broken drive firmware.
>
> Then the linux code won't work on it, have you tried? I've tried a lot
> of different IBM models, they all do service interrupts just fine.

bug in the firmware version on Jeffs drives perhaps ?

Dave

2003-05-13 17:51:25

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Tue, May 13 2003, Dave Jones wrote:
> On Tue, May 13, 2003 at 08:40:59AM +0200, Jens Axboe wrote:
> > > Weird. Mine doesn't seem to assert it, nor does the identify page
> > > indicate it's supported. Maybe I have a broken drive firmware.
> >
> > Then the linux code won't work on it, have you tried? I've tried a lot
> > of different IBM models, they all do service interrupts just fine.
>
> bug in the firmware version on Jeffs drives perhaps ?

It's possible, it would help a lot if Jeff would answer the question
above and maybe even share what drive he is using with us.

--
Jens Axboe

2003-05-13 17:52:49

by Jeff Garzik

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Tue, May 13, 2003 at 08:03:34PM +0200, Jens Axboe wrote:
> On Tue, May 13 2003, Dave Jones wrote:
> > On Tue, May 13, 2003 at 08:40:59AM +0200, Jens Axboe wrote:
> > > > Weird. Mine doesn't seem to assert it, nor does the identify page
> > > > indicate it's supported. Maybe I have a broken drive firmware.
> > >
> > > Then the linux code won't work on it, have you tried? I've tried a lot
> > > of different IBM models, they all do service interrupts just fine.
> >
> > bug in the firmware version on Jeffs drives perhaps ?
>
> It's possible, it would help a lot if Jeff would answer the question
> above and maybe even share what drive he is using with us.

hehe, just did (answer: no). I'll post hdparm -I for it tomorrow.

Jeff



2003-05-13 17:55:10

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Tue, May 13 2003, Jeff Garzik wrote:
> On Tue, May 13, 2003 at 08:03:34PM +0200, Jens Axboe wrote:
> > On Tue, May 13 2003, Dave Jones wrote:
> > > On Tue, May 13, 2003 at 08:40:59AM +0200, Jens Axboe wrote:
> > > > > Weird. Mine doesn't seem to assert it, nor does the identify page
> > > > > indicate it's supported. Maybe I have a broken drive firmware.
> > > >
> > > > Then the linux code won't work on it, have you tried? I've tried a lot
> > > > of different IBM models, they all do service interrupts just fine.
> > >
> > > bug in the firmware version on Jeffs drives perhaps ?
> >
> > It's possible, it would help a lot if Jeff would answer the question
> > above and maybe even share what drive he is using with us.
>
> hehe, just did (answer: no). I'll post hdparm -I for it tomorrow.

:) thanks! fwiw, I've tried DTLA, DPTA, and the IC vancouvers here.

--
Jens Axboe

2003-05-13 18:29:19

by Jeff Garzik

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Tue, May 13, 2003 at 08:13:37PM +0200, Jens Axboe wrote:
> btw, you may want to see the IDE_TCQ_FIDDLE_SI define in ide-tcq, here's
> the comment I put there:
>
> /*
> * we are leaving the SERVICE interrupt alone, IBM drives have it
> * on per default and it can't be turned off. Doesn't matter, this
> * is the sane config.
> */
> #undef IDE_TCQ_FIDDLE_SI
>
> Are you sure this isn't what you are seeing?


My information comes solely from IDENTIFY DEVICE...

Jeff



2003-05-13 18:02:01

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Tue, May 13 2003, Jens Axboe wrote:
> On Tue, May 13 2003, Jeff Garzik wrote:
> > On Tue, May 13, 2003 at 08:03:34PM +0200, Jens Axboe wrote:
> > > On Tue, May 13 2003, Dave Jones wrote:
> > > > On Tue, May 13, 2003 at 08:40:59AM +0200, Jens Axboe wrote:
> > > > > > Weird. Mine doesn't seem to assert it, nor does the identify page
> > > > > > indicate it's supported. Maybe I have a broken drive firmware.
> > > > >
> > > > > Then the linux code won't work on it, have you tried? I've tried a lot
> > > > > of different IBM models, they all do service interrupts just fine.
> > > >
> > > > bug in the firmware version on Jeffs drives perhaps ?
> > >
> > > > It's possible, it would help a lot if Jeff would answer the question
> > > above and maybe even share what drive he is using with us.
> >
> > hehe, just did (answer: no). I'll post hdparm -I for it tomorrow.
>
> :) thanks! fwiw, I've tried DTLA, DPTA, and the IC vancouvers here.

btw, you may want to see the IDE_TCQ_FIDDLE_SI define in ide-tcq, here's
the comment I put there:

/*
* we are leaving the SERVICE interrupt alone, IBM drives have it
* on per default and it can't be turned off. Doesn't matter, this
* is the sane config.
*/
#undef IDE_TCQ_FIDDLE_SI

Are you sure this isn't what you are seeing?

--
Jens Axboe

2003-05-13 18:33:01

by Oliver Neukum

Subject: Re: 2.5.69, IDE TCQ can't be enabled


> You don't have to reproduce, your case has two drives on a channel doing
> tcq. That's not really supported, and the last patch sent should make
> that scenario "work" (by not enabling tcq on any of them).

Is this a problem in principle?

> The DTTA, according to FreeBSD, has a bug with > 64K transfers. But you
> said that worked fine, so...

It wasn't written to.

Regards
Oliver

2003-05-13 18:53:31

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Tue, May 13 2003, Oliver Neukum wrote:
>
> > You don't have to reproduce, your case has two drives on a channel doing
> > tcq. That's not really supported, and the last patch sent should make
> > that scenario "work" (by not enabling tcq on any of them).
>
> Is this a problem in principle?

Yes. Without dedicated hardware support, it's too ugly to support. So I
don't want to :)

> > The DTTA, according to FreeBSD, has a bug with > 64K transfers. But you
> > said that worked fine, so...
>
> It wasn't written to.

Ok

--
Jens Axboe

2003-05-13 19:57:49

by Bill Davidsen

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Mon, 12 May 2003, Oleg Drokin wrote:

> How do you think people will test code that is removed?

The people most likely to fix it know there's a problem, why leave it
around to corrupt filesystems? Leave the code if Jens thinks he will get
to it before 2.6, but comment out the option until he does.

> Or do you mean that nobody plans to look at this ever?

Jens plans to, but there are other things on his plate.

> I remember that Jens Axboe promised to take a look at it some
> months ago.


On Mon, 12 May 2003, Jeff Garzik wrote:

> On Mon, May 12, 2003 at 11:58:10AM -0600, Mudama, Eric wrote:
> > The only difference between SATA TCQ and PATA TCQ is that in PATA TCQ, the
> > drive doesn't report the active tag bitmap back to the host after each
> > command. Other than that they are functionally identical to my
> > understanding. (Yes, there are options like first-party DMA, but these are
> > not requirements)
>
> That's from the "drive side." From the OS side, the ideal
> implementation isn't here yet :)
>
> Ideally there is a DMA ring of taskfiles and scatterlists. The OS
> (producer) queues these up asynchronously, and the host+devices
> (consumer) executes the taskfiles in the ring. AHCI does this.
>
> With PATA TCQ, we only have a single scatterlist, and are forced to
> have more OS-side infrastructure for command queueing, processing, etc.
>
> As an aside, as drives and hosts get faster, we will actually want
> _fewer_ interrupts (i.e. interrupt coalescing).
>
> All this points to making the host smarter.
> The drives are already pretty damn smart ;-)

Unfortunately it depends on the drive actually working if it claims to
support the feature. That seems to be a problem.


On Mon, 12 May 2003, Mudama, Eric wrote:

> TCQ shouldn't benefit writes significantly from a performance perspective if
> the drive is reasonably smart. TCQ *will* have a huge performance
> improvement for random reads since the drive can order responses based on
> minimal rotational latency.
>
> Increasing queue depth reduces the average seek time between commands, both
> in distance and rotational latency. Provided a drive doesn't do dumb stuff
> like we discussed earlier, then it should be good.

One problem which seems probable is that the drive knows less about the
system than the o/s (I hope!) and therefore it can only optimize the order
of i/o for most i/o in the shortest time. It would seem that the deadline
scheduler benefits from doing not the quickest thing but the correct thing
in terms of ordering. I believe that once the i/o is queued (assuming the
drive works right) the drive makes the decision about i/o order. That may
be the wrong thing to do under load, and starve some processes.
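
The tension described above, between the quickest order and the fair order, can be illustrated with a toy selection rule: serve the nearest request, unless the oldest one has waited past a deadline. This is only a sketch of the idea and not the Linux deadline scheduler's actual algorithm; the function name, the time units and the request tuples are hypothetical.

```python
def pick_next(head, pending, now, deadline):
    """pending holds (arrival_time, track) tuples.

    Prefer the nearest track, but serve the oldest request first once it
    has waited at least `deadline` time units.
    """
    oldest = min(pending, key=lambda req: req[0])
    if now - oldest[0] >= deadline:
        return oldest
    return min(pending, key=lambda req: abs(req[1] - head))


# A far-away request that arrived first, and two nearby later ones.
pending = [(0, 900), (5, 110), (6, 120)]
early = pick_next(head=100, pending=pending, now=7, deadline=10)   # (5, 110)
late = pick_next(head=100, pending=pending, now=12, deadline=10)   # (0, 900)
```

Once requests have been handed to the drive's own queue, the host can no longer apply a rule like the second branch; the drive alone decides the order, which is the starvation risk being raised.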

There was discussion recently about limiting the requests with SCSI, for
just this reason.

Unless there's a *lot* of gain from doing TCQ, perhaps this should either
wait, be dropped, or only be enabled for a whitelist of known actually
functional drives. Seems like a poor risk to benefit ratio if it doesn't
work just right, and perhaps this should go on the "it seemed like a good
idea at the time" pile. There's nothing the code can do to guard against
bad drive firmware except not use it.

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2003-05-13 19:59:14

by Andre Hedrick

Subject: Re: 2.5.69, IDE TCQ can't be enabled


Why are we still dorking around with device TCQ?
There are three holes in the state machine.
IBM's design (goat-screw) is lamer than a duck.
Maxtor thought about redoing TCQ, to not leave the host in a daze but
dropped the ball.

Nobody cares about a broken pile of crap in the NCITS standard, otherwise
the rest of the drive vendors would have adopted.

Stop with drive side crappola and make it host side for SATA.

If you want TCQ go use another OS, and kiss your data good bye.

Not a single OS (linux included) can deal with an error in flush cache,
much less an error from a previous tagged request.

Don't do it, and if you do, don't bitch.

Cheers,

Andre Hedrick
LAD Storage Consulting Group

PS Jens this is not directed to you, just this was the fatest cc list to
bang a drum on.

On Tue, 13 May 2003, Jens Axboe wrote:

> On Tue, May 13 2003, Jens Axboe wrote:
> > On Tue, May 13 2003, Jeff Garzik wrote:
> > > On Tue, May 13, 2003 at 08:03:34PM +0200, Jens Axboe wrote:
> > > > On Tue, May 13 2003, Dave Jones wrote:
> > > > > On Tue, May 13, 2003 at 08:40:59AM +0200, Jens Axboe wrote:
> > > > > > > Weird. Mine doesn't seem to assert it, nor does the identify page
> > > > > > > indicate it's supported. Maybe I have a broken drive firmware.
> > > > > >
> > > > > > Then the linux code won't work on it, have you tried? I've tried a lot
> > > > > > of different IBM models, they all do service interrupts just fine.
> > > > >
> > > > > bug in the firmware version on Jeffs drives perhaps ?
> > > >
> > > > > It's possible, it would help a lot if Jeff would answer the question
> > > > above and maybe even share what drive he is using with us.
> > >
> > > hehe, just did (answer: no). I'll post hdparm -I for it tomorrow.
> >
> > :) thanks! fwiw, I've tried DTLA, DPTA, and the IC vancouvers here.
>
> btw, you may want to see the IDE_TCQ_FIDDLE_SI define in ide-tcq, here's
> the comment I put there:
>
> /*
> * we are leaving the SERVICE interrupt alone, IBM drives have it
> * on per default and it can't be turned off. Doesn't matter, this
> * is the sane config.
> */
> #undef IDE_TCQ_FIDDLE_SI
>
> Are you sure this isn't what you are seeing?
>
> --
> Jens Axboe
>

2003-05-13 20:18:56

by Mudama, Eric

Subject: RE: 2.5.69, IDE TCQ can't be enabled



-----Original Message-----
>From: Andre Hedrick [mailto:[email protected]]
>Sent: Tuesday, May 13, 2003 2:04 PM
>To: Jens Axboe
>Cc: Jeff Garzik; Dave Jones; Mudama, Eric; Oleg Drokin; Bartlomiej
>Zolnierkiewicz; Alan Cox; Oliver Neukum; [email protected];
>[email protected]
>Subject: Re: 2.5.69, IDE TCQ can't be enabled
>
>Why are we still dorking around with device TCQ?
>There are three holes in the state machine.
>IBM's design (goat-screw) is lamer than a duck.
>Maxtor thought about redoing TCQ, to not leave the host in a daze but
>dropped the ball.

What ball, exactly, did we drop? We have yet to ship a TCQ drive because we
think we can get it "right" but it is taking us time, in our view, to do
properly.

--eric

2003-05-13 20:16:36

by Mudama, Eric

Subject: RE: 2.5.69, IDE TCQ can't be enabled


-----Original Message-----
From: Bill Davidsen [mailto:[email protected]]
Sent: Tuesday, May 13, 2003 2:05 PM
To: Oleg Drokin; Jeff Garzik; Mudama, Eric
Cc: 'Jens Axboe'; Alan Cox; Linux Kernel Mailing List
Subject: Re: 2.5.69, IDE TCQ can't be enabled

>On Mon, 12 May 2003, Mudama, Eric wrote:
>
>> TCQ shouldn't benefit writes significantly from a performance perspective
>> if the drive is reasonably smart. TCQ *will* have a huge performance
>> improvement for random reads since the drive can order responses based on
>> minimal rotational latency.
>>
>> Increasing queue depth reduces the average seek time between commands,
>> both in distance and rotational latency. Provided a drive doesn't do dumb
>> stuff like we discussed earlier, then it should be good.
>
>One problem which seems probable is that the drive knows less about the
>system than the o/s (I hope!) and therefore it can only optimize the order
>of i/o for most i/o in the shortest time. It would seem that the deadline
>scheduler benefits from doing not the quickest thing but the correct thing
>in terms of ordering. I believe that once the i/o is queued (assuming the
>drive works right) the drive makes the decision about i/o order. That may
>be the wrong thing to do under load, and starve some processes.

The general case we use is to optimize for maximum ops-per-second. The
potential benefit from queueing is indicated by the delta between random
write performance (cached operations we can order for performance) versus
random read performance (cache misses we can't choose the order of). In a
virtual drive with near-infinite queue depth, rotational latency completely
factors out so that you get equal ops/sec from a 7200 RPM drive as you do
from a 15K RPM drive. That is what we are shooting for.
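
A rough way to see why rotational latency factors out as queue depth grows
is the toy model below. It is only a sketch, not drive firmware: it assumes
each queued request sits at an independent, uniformly random angular offset
ahead of the head, ignores seek distance entirely, and assumes the drive
always services whichever request comes under the head first. The names
mean_rotational_wait and rotation_ms are made up for this illustration.

```python
import random


def mean_rotational_wait(queue_depth, trials=20000, seed=0):
    """Mean wait, as a fraction of one revolution, when the drive always
    services whichever queued request reaches the head first."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Assumption: every queued request is at a uniformly random
        # angular offset ahead of the head; the nearest one is chosen.
        total += min(rng.random() for _ in range(queue_depth))
    return total / trials


def rotation_ms(rpm):
    """Duration of one full revolution in milliseconds."""
    return 60000.0 / rpm


if __name__ == "__main__":
    for depth in (1, 2, 4, 8, 32):
        frac = mean_rotational_wait(depth)
        print(depth,
              round(frac * rotation_ms(7200), 2),
              round(frac * rotation_ms(15000), 2))
```

Under these assumptions the mean wait is 1/(depth + 1) of a revolution, so
it shrinks toward zero for either spindle speed as the depth grows, which is
the sense in which the two drives converge.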

>There was discussion recently about limiting the requests with SCSI, for
>just this reason.
>
>Unless there's a *lot* of gain from doing TCQ, perhaps this should either
>wait, be dropped, or only be enabled for a whitelist of known actually
>functional drives. Seems like a poor risk to benefit ratio if it doesn't
>work just right, and perhaps this should go on the "it seemed like a good
>idea at the time" pile. There's nothing the code can do to guard against
>bad drive firmware except not use it.

I am working here to test usability with TCQ on our prototypes so I can
evaluate these sorts of queue depth issues and whether I think we're doing
the right stuff algorithmically in the drive. Hopefully some of my effort
helps.
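
The starvation concern quoted above can be illustrated with another toy
model. This is a sketch under loud assumptions, not a description of any
real firmware or of the Linux deadline scheduler: requests are points on a
0..1 stroke, the "nearest" policy always services the request with the
shortest seek from the current head position, the host refills the freed
tag immediately with a uniformly random request, and max_wait is a
hypothetical helper name.

```python
import random


def max_wait(policy, depth=32, steps=5000, seed=1):
    """Largest number of service steps any single request had to wait."""
    rng = random.Random(seed)
    head = 0.5
    # Each entry is (position on a 0..1 stroke, step at which it was queued).
    queue = [(rng.random(), 0) for _ in range(depth)]
    worst = 0
    for step in range(steps):
        if policy == "fifo":
            idx = 0  # oldest request first
        else:
            # "nearest": shortest seek from the current head position
            idx = min(range(len(queue)),
                      key=lambda i: abs(queue[i][0] - head))
        pos, queued_at = queue.pop(idx)
        worst = max(worst, step - queued_at)
        head = pos
        # Assumption: the host refills the freed tag right away.
        queue.append((rng.random(), step + 1))
    # Requests still pending at the end count as well.
    return max([worst] + [steps - queued_at for _, queued_at in queue])


if __name__ == "__main__":
    print("fifo   ", max_wait("fifo"))
    print("nearest", max_wait("nearest"))
```

With a constant queue depth, strict FIFO bounds every wait at depth - 1
steps; any policy that lets a newer request jump ahead of an older one must
make some request wait longer than that. How much longer on a real drive
depends on the firmware, which this sketch does not model.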

--eric

2003-05-13 20:24:10

by Andre Hedrick

Subject: Re: 2.5.69, IDE TCQ can't be enabled



This is the last time I got TAG running clean!
Proof you have zero gain on writes and huge gains on reads.

Still it is a lame protocol.

On Tue, 13 May 2003, Jeff Garzik wrote:

> On Tue, May 13, 2003 at 08:13:37PM +0200, Jens Axboe wrote:
> > btw, you may want to see the IDE_TCQ_FIDDLE_SI define in ide-tcq, here's
> > the comment I put there:
> >
> > /*
> > * we are leaving the SERVICE interrupt alone, IBM drives have it
> > * on per default and it can't be turned off. Doesn't matter, this
> > * is the sane config.
> > */
> > #undef IDE_TCQ_FIDDLE_SI
> >
> > Are you sure this isn't what you are seeing?
>
>
> My information comes solely from IDENTIFY DEVICE...
>
> Jeff

Andre Hedrick
LAD Storage Consulting Group


Attachments:
ide-2.3.99-pre9-tag.dmesg (5.69 kB)

2003-05-13 20:29:40

by Andre Hedrick

Subject: RE: 2.5.69, IDE TCQ can't be enabled


Correct, no drives shipped.

Maxtor failed to move forward in updating the device side TCQ.
I will have to look up my email record to give you who I was working with
and talking to about the issues. I do know it was in the Longmont
division.

The last discussion I had with Maxtor related to forcing the state
machine to default to bus release on execution of the command block.
This would fix the gaping hole in the protocol. The effects from this
single change were not fully white board material, and may have addressed
or covered the other two pitfalls.

Drop the ball, meaning it was started and then tabled.
Tabled because of SATA/SAS.

Comments?


Andre Hedrick
LAD Storage Consulting Group


On Tue, 13 May 2003, Mudama, Eric wrote:

>
>
> -----Original Message-----
> >From: Andre Hedrick [mailto:[email protected]]
> >Sent: Tuesday, May 13, 2003 2:04 PM
> >To: Jens Axboe
> >Cc: Jeff Garzik; Dave Jones; Mudama, Eric; Oleg Drokin; Bartlomiej
> >Zolnierkiewicz; Alan Cox; Oliver Neukum; [email protected];
> >[email protected]
> >Subject: Re: 2.5.69, IDE TCQ can't be enabled
> >
> >Why are we still dorking around with device TCQ.
> >There are three holes in the state machine.
> >IBM's design (goat-screw) is lamer than a duck.
> >Maxtor thought about redoing TCQ, to not leave the host in a daze but
> >dropped the ball.
>
> What ball, exactly, did we drop? We have yet to ship a TCQ drive because we
> think we can get it "right" but it is taking us time, in our view, to do
> properly.
>
> --eric
>

2003-05-13 20:31:28

by Mudama, Eric

Subject: RE: 2.5.69, IDE TCQ can't be enabled



-----Original Message-----
>From: Andre Hedrick [mailto:[email protected]]
>Sent: Tuesday, May 13, 2003 2:29 PM
>To: Jeff Garzik
>Cc: Jens Axboe; Dave Jones; Mudama, Eric; Oleg Drokin; Bartlomiej
>Zolnierkiewicz; Alan Cox; Oliver Neukum; [email protected];
>[email protected]
>Subject: Re: 2.5.69, IDE TCQ can't be enabled
>
>This is the last time I got TAG running clean!
>Proof you have zero gain on writes and huge gains on reads.

Of course there's no gain on writes with write cache enabled, that is
obvious:

It is nearly impossible for a drive to cache random reads, therefore they
have the greatest performance penalty due to seeks and rotational latency.
That is why queueing improves random reads so much.

Repetitive and Sequential reads should see no benefit at all from queueing,
since the virtually-zero-size drive cache actually hits its locality
(spatial or temporal) cases for these commands.

Similarly, any write with the drive's write cache enabled will see zero or
near-zero benefit from queueing writes. The only reason to use queued
writes is so you can intermingle them with queued reads without flushing
your tags.

If you disable write cache on the drive (Journalling/RAID environments) then
you'll see performance nearly identical to reads, which then can benefit by
queueing to the same degree.

>Still it is a lame protocol.

I don't necessarily disagree, however I'm not on T13 and didn't have any
input.

--eric

2003-05-13 23:42:30

by Alan

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Tue, 2003-05-13 at 21:03, Andre Hedrick wrote:
> Not a single OS (linux included) can deal with a error in flush cache,
> much less an error from a previous tagged request.

To be reasonable, it's not clear what you can do when a flush cache fails. The
only cases I can see where you can handle anything intelligently are drive side,
but even those are not clear. If the drive flushes all the blocks it can get
to disk, it's really the same as a fatal write error, except we have less idea
how to recover and have already lost the data.

For the cases it matters you turn off write cache and we (maybe SATA time mostly)
get tcq working properly.

This is the same issue as with SCSI. SCSI has a whole rat's nest of things that
seem to exist solely to screw up error recovery 8)

2003-05-14 06:51:34

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Tue, May 13 2003, Andre Hedrick wrote:
>
>
> This is the last time I got TAG running clean!

Remember I saw that code, and it wasn't doing any queueing at all (ie
you had tcq enabled, but it didn't work).

> Proof you have zero gain on writes and huge gains on reads.

Which I think we already discussed. Besides there's no proof there at
all, you just dumped a bonnie run. What does that prove?

--
Jens Axboe

2003-05-14 07:03:24

by Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

On Tue, May 13 2003, Jeff Garzik wrote:
> On Tue, May 13, 2003 at 08:13:37PM +0200, Jens Axboe wrote:
> > btw, you may want to see the IDE_TCQ_FIDDLE_SI define in ide-tcq, here's
> > the comment I put there:
> >
> > /*
> > * we are leaving the SERVICE interrupt alone, IBM drives have it
> > * on per default and it can't be turned off. Doesn't matter, this
> > * is the sane config.
> > */
> > #undef IDE_TCQ_FIDDLE_SI
> >
> > Are you sure this isn't what you are seeing?
>
>
> My information comes solely from IDENTIFY DEVICE...

Maybe you shouldn't trust that, then :-)

Seriously, I suppose it depends on the drive or maybe that IBM
interprets the bits differently. This puppy:

IC35L040AVVA07-0

says it doesn't support it either, but I can damn well assure you that
it does. What it doesn't support (like other IBM's) is trying to change
it, then it complains.

So I'd be willing to bet lots of money that your drive generates service
interrupts just fine. As I said, I've yet to see one that doesn't.

--
Jens Axboe

Subject: Re: 2.5.69, IDE TCQ can't be enabled

Dave Jones <[email protected]> writes:

>On Tue, May 13, 2003 at 08:40:59AM +0200, Jens Axboe wrote:
> > > Weird. Mine doesn't seem to assert it, nor does the identify page
> > > indicate it's supported. Maybe I have a broken drive firmware.
> >
> > Then the linux code won't work on it, have you tried? I've tried a lot
> > of different IBM models, they all do service interrupts just fine.

>bug in the firmware version on Jeffs drives perhaps ?

As he has an old firmware on the drive (which he really should
upgrade, else it will eat his drive sooner or later), this might well
be possible.

> > I can confirm that. This drive Model=IBM-DTLA-307045, FwRev=TX6OA60A,
> > SerialNo=YMCYMT3Y229 has eaten my filesystem with TCQ on 2.5.69

Most current I got my fingers on is A6AA (last four letters of the
FwRev). I'd recommend an update. Even better: Sell the drive on eBay.

Regards
Henning

--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen INTERMETA GmbH
[email protected] +49 9131 50 654 0 http://www.intermeta.de/

Java, perl, Solaris, Linux, xSP Consulting, Web Services
freelance consultant -- Jakarta Turbine Development -- hero for hire