2003-09-26 12:47:08

by Måns Rullgård

Subject: [BUG?] SIS IDE DMA errors


I reported this a long time ago, but nobody seemed to care, so here it
is again.

With all 2.6.0 versions so far, I get these errors when writing lots
of data to the disk:

hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0xd0 { Busy }

hda: DMA disabled
ide0: reset: success
hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0xd0 { Busy }

hda: DMA disabled
ide0: reset: success
Losing too many ticks!
TSC cannot be used as a timesource. (Are you running with SpeedStep?)
Falling back to a sane timesource.
hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete DataRequest }
hda: lost interrupt
hda: lost interrupt
hda: lost interrupt
hda: lost interrupt
hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }

hda: drive not ready for command

It only happens when I write more than about 100 MB at more than 5
MB/s or so, never when writing smaller amounts of data.
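For reference, the status values in the log above can be decoded by hand. The following is a minimal sketch, not kernel code: the ATA status names are the labels the kernel prints in the log, the bus-master bit layout is the standard bus-master IDE status register as I understand it, and the function and table names are made up for illustration.

```python
# ATA status register bits, labelled the way the kernel log prints them.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# Bus-master IDE status register bits (assumed standard layout).
BM_STATUS_BITS = [
    (0x01, "active"),
    (0x02, "error"),
    (0x04, "interrupt"),
    (0x20, "drive0_dma_capable"),
    (0x40, "drive1_dma_capable"),
    (0x80, "simplex"),
]


def decode_ata_status(status):
    """Return the flag names for an ATA status byte.

    While Busy is set the other bits are not meaningful, so only
    "Busy" is reported, which matches the "{ Busy }" lines in the log.
    """
    if status & 0x80:
        return ["Busy"]
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


def decode_bm_status(value):
    """Return the flag names for a bus-master DMA status byte."""
    return [name for mask, name in BM_STATUS_BITS if value & mask]


if __name__ == "__main__":
    print(hex(0xD0), decode_ata_status(0xD0))
    print(hex(0x58), decode_ata_status(0x58))
    print(hex(0x21), decode_bm_status(0x21))
```

Read this way, "dma status == 0x21" says the bus-master engine still considered the transfer active when the timer expired, with no error bit and no interrupt bit set.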

System details follow.

00:00.0 Host bridge: Silicon Integrated Systems [SiS] 650 Host (rev 01)
00:01.0 PCI bridge: Silicon Integrated Systems [SiS] SiS 530 Virtual PCI-to-PCI bridge (AGP)
00:02.0 ISA bridge: Silicon Integrated Systems [SiS] 85C503/5513 (rev 10)
00:02.1 SMBus: Silicon Integrated Systems [SiS]: Unknown device 0016
00:02.2 USB Controller: Silicon Integrated Systems [SiS] SiS7001 USB Controller (rev 07)
00:02.3 USB Controller: Silicon Integrated Systems [SiS] SiS7001 USB Controller (rev 07)
00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (rev d0)
00:02.7 Multimedia audio controller: Silicon Integrated Systems [SiS] SiS7012 PCI Audio Accelerator (rev a0)
00:03.0 Ethernet controller: Silicon Integrated Systems [SiS] SiS900 10/100 Ethernet (rev 90)
00:0a.0 CardBus bridge: Ricoh Co Ltd RL5c476 II (rev a8)
00:0a.1 CardBus bridge: Ricoh Co Ltd RL5c476 II (rev a8)
00:0a.2 FireWire (IEEE 1394): Ricoh Co Ltd R5C552 IEEE 1394 Controller
00:0c.0 Communication controller: Conexant HSF 56k HSFi Modem (rev 01)
01:00.0 VGA compatible controller: Silicon Integrated Systems [SiS] SiS650/651/M650/740 PCI/AGP VGA Display Adapter

00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (rev d0) (prog-if 80 [Master])
Subsystem: Asustek Computer, Inc.: Unknown device 1688
Flags: bus master, fast devsel, latency 128
I/O ports at b800 [size=16]

Linux version 2.6.0-test5-nick15 (mru@ford) (gcc version 3.3.1) #14 Fri Sep 19 11:26:47 CEST 2003
Video mode to be used for restore is ffff
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000000dffa000 (usable)
BIOS-e820: 000000000dffa000 - 000000000dfff000 (ACPI data)
BIOS-e820: 000000000dfff000 - 000000000e000000 (ACPI NVS)
BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
223MB LOWMEM available.
On node 0 totalpages: 57338
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 53242 pages, LIFO batch:12
HighMem zone: 0 pages, LIFO batch:1
DMI 2.3 present.
ACPI: RSDP (v000 ASUS ) @ 0x000f6460
ACPI: RSDT (v001 ASUS M2000E 0x42302e31 MSFT 0x31313031) @ 0x0dffa000
ACPI: FADT (v001 ASUS M2000E 0x42302e31 MSFT 0x31313031) @ 0x0dffa080
ACPI: BOOT (v001 ASUS M2000E 0x42302e31 MSFT 0x31313031) @ 0x0dffa040
ACPI: DSDT (v001 ASUS M2000E 0x00001000 MSFT 0x0100000e) @ 0x00000000
ACPI: MADT not present
Building zonelist for node : 0
Kernel command line: root=/dev/hda1 init=/sbin/lvm2/lvmroot video=sisfb:mode:1024x768x8
sisfb: Options mode:1024x768x8
No local APIC present or hardware disabled
Initializing CPU#0
PID hash table entries: 1024 (order 10: 8192 bytes)
Detected 2070.560 MHz processor.
Console: colour VGA+ 80x25
Memory: 223688k/229352k available (1655k kernel code, 4972k reserved, 744k data, 136k init, 0k highmem)
Calibrating delay loop... 4087.80 BogoMIPS
Security Scaffold v1.0.0 initialized
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
-> /dev
-> /dev/console
-> /root
CPU: After generic identify, caps: bfebf9ff 00000000 00000000 00000000
CPU: After vendor identify, caps: bfebf9ff 00000000 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: After all inits, caps: bfebf9ff 00000000 00000000 00000080
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU#0: Intel P4/Xeon Extended MCE MSRs (12) available
CPU#0: Thermal monitoring enabled
CPU: Intel Mobile Intel(R) Pentium(R) 4 - M CPU 1.80GHz stepping 07
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xf17c0, last bus=1
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20030813
tbxface-0117 [03] acpi_load_tables : ACPI Tables successfully acquired
Parsing all Control Methods:........................................................................................................................................................................................................
Table [DSDT](id F004) - 585 Objects with 52 Devices 200 Methods 21 Regions
ACPI Namespace successfully loaded at root c038b39c
evxfevnt-0093 [04] acpi_enable : Transition to ACPI mode successful
evgpeblk-0748 [06] ev_create_gpe_block : GPE 00 to 15 [_GPE] 2 regs at 000000000000E420 on int 9
evgpeblk-0748 [06] ev_create_gpe_block : GPE 16 to 31 [_GPE] 2 regs at 000000000000E430 on int 9
Completing Region/Field/Buffer/Package initialization:.............................................. psargs-0352: *** Error: Looking up [\_PR_.CPU0] in namespace, AE_NOT_FOUND
search_node cdf6f8a8 start_node cdf6f8a8 return_node 00000000
psparse-1121: *** Error: , AE_NOT_FOUND

nsinit-0293 [06] ns_init_one_object : Could not execute arguments for [_PSL] (Package), AE_NOT_FOUND
....................
Initialized 21/21 Regions 5/5 Fields 19/19 Buffers 21/21 Packages (593 nodes)
Executing all Device _STA and_INI methods:.....................................................
53 Devices found containing: 53 _STA, 2 _INI methods
ACPI: Interpreter enabled
ACPI: Using PIC for interrupt routing
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 9 10 *11 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 9 10 *11 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 7 9 10 *11 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 9 10 *11 15)
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
Enabling SiS 96x SMBus.
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT]
ACPI: Power Resource [FN0] (on)
ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 9
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 5
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10
PCI: Using ACPI for IRQ routing
PCI: if you experience problems, try using option 'pci=noacpi' or even 'acpi=off'
sisfb: Video ROM found and mapped to c00c0000
sisfb: Framebuffer at 0xf0000000, mapped to 0xce80f000, size 32768k
sisfb: MMIO at 0xe7800000, mapped to 0xd0810000, size 128k
sisfb: Memory heap starting at 12288K
sisfb: Using MMIO queue mode
sisfb: LVDS transmitter detected
sisfb: Default mode is 1024x768x8 (60Hz)
sisfb: Added MTRRs
sisfb: Installed SISFB_GET_INFO ioctl (80046ef8)
sisfb: 2D acceleration is enabled, scrolling mode ypan
fb0: SIS 650/M650/651/740 VGA frame buffer device, Version 1.6.01
sisfb: Change mode to 1024x768x8-60Hz
Console: switching to colour frame buffer device 128x48
pty: 256 Unix98 ptys configured
SBF: Simple Boot Flag extension found and enabled.
SBF: Setting boot flags 0x1
Machine check exception polling timer started.
cpufreq: P4/Xeon(TM) CPU On-Demand Clock Modulation available
Total HugeTLB memory allocated, 0
devfs: v1.22 (20021013) Richard Gooch ([email protected])
devfs: boot_options: 0x1
ACPI: AC Adapter [AC] (on-line)
ACPI: Battery Slot [BAT0] (battery present)
ACPI: Power Button (FF) [PWRF]
ACPI: Lid Switch [LID]
ACPI: Sleep Button (CM) [SLPB]
ACPI: Fan [FAN0] (on)
ACPI: Processor [CPU] (supports C1 C2, 4 throttling states)
ACPI: Thermal Zone [THRM] (45 C)
Asus Laptop ACPI Extras version 0.24a
M2E model detected, supported
Notify Handler installed successfully
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
SIS5513: IDE controller at PCI slot 0000:00:02.5
SIS5513: chipset revision 208
SIS5513: not 100% native mode: will probe irqs later
SIS5513: SiS 961 MuTIOL IDE UDMA100 controller
ide0: BM-DMA at 0xb800-0xb807, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xb808-0xb80f, BIOS settings: hdc:DMA, hdd:pio
hda: IC25N040ATMR04-0, ATA DISK drive
Using anticipatory scheduling io scheduler
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdc: ASUS SCB-2408, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 1024KiB
hda: 78140160 sectors (40007 MB) w/1740KiB Cache, CHS=16383/255/63, UDMA(100)
/dev/ide/host0/bus0/target0/lun0: p1 p2 p3
Console: switching to colour frame buffer device 128x48
mice: PS/2 mouse device common for all mice
i8042.c: Detected active multiplexing controller, rev 1.1.
serio: i8042 AUX0 port at 0x60,0x64 irq 12
serio: i8042 AUX1 port at 0x60,0x64 irq 12
serio: i8042 AUX2 port at 0x60,0x64 irq 12
Synaptics Touchpad, model: 1
Firware: 4.6
180 degree mounted touchpad
Sensor: 18
new absolute packet format
Touchpad has extended capability bits
-> four buttons
-> multifinger detection
-> palm detection
input: Synaptics Synaptics TouchPad on isa0060/serio4
serio: i8042 AUX3 port at 0x60,0x64 irq 12
input: AT Set 2 keyboard on isa0060/serio0
serio: i8042 KBD port at 0x60,0x64 irq 1
device-mapper: 4.0.0-ioctl (2003-06-04) initialised: [email protected]
NET: Registered protocol family 2
IP: routing cache hash table of 2048 buckets, 16Kbytes
TCP: Hash tables configured (established 16384 bind 32768)
NET: Registered protocol family 1
NET: Registered protocol family 17
ACPI: (supports S0 S1 S3 S4 S5)
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Mounted devfs on /dev
Freeing unused kernel memory: 136k freed
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS on hda1, internal journal
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
Adding 522104k swap on /dev/hda3. Priority:-1 extents:1
EXT3 FS on dm-3, internal journal
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-4, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-5, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-6, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-8, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-9, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Real Time Clock Driver v1.12
Linux Kernel Card Services 3.1.22
options: [pci] [cardbus] [pm]
Yenta: CardBus bridge found at 0000:00:0a.0 [1043:1687]
Yenta: ISA IRQ list 0098, PCI irq10
Socket status: 30000006
Yenta: CardBus bridge found at 0000:00:0a.1 [1043:1687]
Yenta: ISA IRQ list 0098, PCI irq10
Socket status: 30000006
sis900.c: v1.08.06 9/24/2002
eth0: ICS LAN PHY transceiver found at address 1.
eth0: Using transceiver found at address 1 as default
eth0: SiS 900 PCI Fast Ethernet at 0xa000, IRQ 5, 00:0c:6e:40:b0:22.
eth0: Media Link On 100mbps full-duplex
intel8x0: clocking to 48000
drivers/usb/core/usb.c: registered new driver usbfs
drivers/usb/core/usb.c: registered new driver hub
ohci-hcd: 2003 Feb 24 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ohci-hcd: block sizes: ed 64 td 64
ohci-hcd 0000:00:02.2: OHCI Host Controller
ohci-hcd 0000:00:02.2: irq 9, pci mem d08b6000
ohci-hcd 0000:00:02.2: new USB bus registered, assigned bus number 1
hub 1-0:0: USB hub found
hub 1-0:0: 3 ports detected
ohci-hcd 0000:00:02.3: OHCI Host Controller
ohci-hcd 0000:00:02.3: irq 9, pci mem d091e000
ohci-hcd 0000:00:02.3: new USB bus registered, assigned bus number 2
hub 2-0:0: USB hub found
hub 2-0:0: 3 ports detected
Linux agpgart interface v0.100 (c) Dave Jones
agpgart: Detected SiS 650 chipset
agpgart: Maximum main memory to use for agp memory: 176M
agpgart: AGP aperture is 64M @ 0xe8000000
hub 1-0:0: debounce: port 1: delay 100ms stable 4 status 0x301
hub 1-0:0: new USB device on port 1, assigned address 2
input: USB HID v1.10 Mouse [Logitech Optical USB Mouse] on usb-0000:00:02.2-1
drivers/usb/core/usb.c: registered new driver hid
drivers/usb/input/hid-core.c: v2.0:USB HID core driver
SCSI subsystem initialized
end_request: I/O error, dev hdc, sector 0
hdc: ATAPI 24X DVD-ROM CD-R/RW drive, 2048kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.12
EXT3 FS on dm-1, internal journal
vmmon: no version magic, tainting kernel.
vmmon: module license 'unspecified' taints kernel.
/dev/vmmon: Module vmmon: registered with major=10 minor=165
/dev/vmmon: Module vmmon: initialized
/dev/vmmon: Module vmmon: unloaded
vmnet: no version magic, tainting kernel.
vmnet: module license 'unspecified' taints kernel.
vmmon: no version magic, tainting kernel.
vmmon: module license 'unspecified' taints kernel.
/dev/vmmon: Module vmmon: registered with major=10 minor=165
/dev/vmmon: Module vmmon: initialized
vmnet: no version magic, tainting kernel.
vmnet: module license 'unspecified' taints kernel.
/dev/vmnet: open called by PID 1997 (vmnet-bridge)
/dev/vmnet: hub 0 does not exist, allocating memory.
/dev/vmnet: port on hub 0 successfully opened
bridge-eth0: up
bridge-eth0: attached
/dev/vmnet: open called by PID 2015 (vmnet-natd)
/dev/vmnet: hub 8 does not exist, allocating memory.
/dev/vmnet: port on hub 8 successfully opened
/dev/vmnet: open called by PID 2270 (vmnet-netifup)
/dev/vmnet: port on hub 8 successfully opened
/dev/vmnet: open called by PID 2288 (vmnet-dhcpd)
/dev/vmnet: port on hub 8 successfully opened
/dev/vmnet: open called by PID 2342 (vmware-vmx)
/dev/vmnet: port on hub 8 successfully opened
spurious 8259A interrupt: IRQ7.
/dev/vmnet: open called by PID 2407 (vmware-vmx)
/dev/vmnet: port on hub 8 successfully opened
/dev/vmnet: open called by PID 2414 (vmware-vmx)
/dev/vmnet: port on hub 8 successfully opened
/dev/vmnet: open called by PID 2471 (vmware-vmx)
/dev/vmnet: port on hub 8 successfully opened
hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0xd0 { Busy }

hda: DMA disabled
ide0: reset: success
hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0xd0 { Busy }

hda: DMA disabled
ide0: reset: success
/dev/vmnet: open called by PID 2509 (vmware-vmx)
/dev/vmnet: port on hub 8 successfully opened
/dev/vmnet: open called by PID 2516 (vmware-vmx)
/dev/vmnet: port on hub 8 successfully opened
/dev/vmnet: open called by PID 2523 (vmware-vmx)
/dev/vmnet: port on hub 8 successfully opened
acpi_bus-0199 [17] acpi_bus_set_power : Device is not power manageable
acpi_thermal-0611 [16] acpi_thermal_active : Unable to turn cooling device [cdf6fd28] 'on'
acpi_bus-0199 [17] acpi_bus_set_power : Device is not power manageable
acpi_thermal-0611 [16] acpi_thermal_active : Unable to turn cooling device [cdf6fd28] 'on'
hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0xd0 { Busy }

hda: DMA disabled
ide0: reset: success
hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0xd0 { Busy }

hda: DMA disabled
ide0: reset: success
Losing too many ticks!
TSC cannot be used as a timesource. (Are you running with SpeedStep?)
Falling back to a sane timesource.
hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete DataRequest }
hda: lost interrupt
hda: lost interrupt
hda: lost interrupt
hda: lost interrupt
hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }

hda: drive not ready for command
hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0xd0 { Busy }

hda: DMA disabled
ide0: reset: success


--
Måns Rullgård
[email protected]


2003-09-26 14:10:11

by Michael Frank

Subject: Re: [BUG?] SIS IDE DMA errors

> Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
> ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
> SIS5513: IDE controller at PCI slot 0000:00:02.5
> SIS5513: chipset revision 208
> SIS5513: not 100% native mode: will probe irqs later
> SIS5513: SiS 961 MuTIOL IDE UDMA100 controller
> ide0: BM-DMA at 0xb800-0xb807, BIOS settings: hda:DMA, hdb:pio
> ide1: BM-DMA at 0xb808-0xb80f, BIOS settings: hdc:DMA, hdd:pio
> hda: IC25N040ATMR04-0, ATA DISK drive

ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
SIS5513: IDE controller at PCI slot 00:02.5
PCI: Found IRQ 10 for device 00:02.5
SIS5513: chipset revision 0
SIS5513: not 100% native mode: will probe irqs later
SIS5513: SiS 962/963 MuTIOL IDE UDMA133 controller
ide0: BM-DMA at 0x4000-0x4007, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0x4008-0x400f, BIOS settings: hdc:pio, hdd:pio
hda: IC35L090AVV207-0, ATA DISK drive


Jul 27 04:22:26 mhfl4 kernel: hda: lost interrupt
Jul 27 04:23:15 mhfl4 kernel: hda: dma_timer_expiry: dma status == 0x24
Jul 27 04:23:25 mhfl4 kernel: hda: DMA interrupt recovery

I run mostly 2.4 on this board, without ACPI. I get similar problems
with 2.4, and with 2.6 when I occasionally run it, though they are not
as bad, except with 2.4.22-pre7.

Suspect chipset related issue which should be looked into.

You could try setting udma mode with hdparm -Xudma[12345] and see
if it helps.

I run this from a script at startup:

sync
hdparm -S 255 -K1 -c3 -Xudma5 /dev/hda

Note: IME, hdparm should not be used when there is substantial
disk activity.

Regards
Michael


2003-09-26 14:27:40

by Måns Rullgård

Subject: Re: [BUG?] SIS IDE DMA errors

Michael Frank <[email protected]> writes:

> Suspect chipset related issue which should be looked into.

That's what someone told me three months ago, too. Nothing happened,
though.

> You could try setting udma mode with hdparm -Xudma[12345] and see
> if it helps.

That's the first thing I try when things go wrong. It didn't help
this time.

> I use from a script on startup
>
> sync
> hdparm -S 255 -K1 -c3 -Xudma5 /dev/hda

I already tried almost all combinations. The only thing that helps is
-d0, if you can call that help.

> Note: IME, hdparm should not be used when there is substantial
> disk activity.

I've noticed. It usually causes the system to freeze for a minute or
so.

I'm a little frustrated at not being able to copy large files without
considerable trouble.

--
Måns Rullgård
[email protected]

2003-09-26 15:33:04

by Michael Frank

Subject: Re: [BUG?] SIS IDE DMA errors

On Friday 26 September 2003 22:07, Måns Rullgård wrote:
> Michael Frank <[email protected]> writes:
>
> > Suspect chipset related issue which should be looked into.
>
> That's what someone told me three months ago, too. Nothing happened,
> though.
>

OK, now that we are two, we copy the IDE maintainer ;)

I guess it is fair to say that we are happy to test patches.

And here is my lspci -vv.

00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (prog-if 80 [Master])
Subsystem: Micro-Star International Co., Ltd.: Unknown device 5332
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 128
Interrupt: pin ? routed to IRQ 10
Region 4: I/O ports at 4000 [size=16]
Capabilities: <available only to root>


Regards
Michael

2003-09-26 16:46:13

by Måns Rullgård

Subject: Re: [BUG?] SIS IDE DMA errors

Michael Frank <[email protected]> writes:

>> > Suspect chipset related issue which should be looked into.
>>
>> That's what someone told me three months ago, too. Nothing happened,
>> though.
>>
>
> OK, now that we are two, we copy the IDE maintainer ;)
>
> I guess it is fair to say that we are happy to test patches.
>
> And here is my lspci -vv.
>
> 00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (prog-if 80 [Master])
> Subsystem: Micro-Star International Co., Ltd.: Unknown device 5332
> Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> Latency: 128
> Interrupt: pin ? routed to IRQ 10
> Region 4: I/O ports at 4000 [size=16]
> Capabilities: <available only to root>

Mine looks rather similar, but there are a few differences. Mine has
Mem+ and DEVSEL=fast.

00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (rev d0) (prog-if 80 [Master])
Subsystem: Asustek Computer, Inc.: Unknown device 1688
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 128
Region 4: I/O ports at b800 [size=16]



--
Måns Rullgård
[email protected]

2003-09-26 17:01:14

by Vojtech Pavlik

Subject: Re: [BUG?] SIS IDE DMA errors

On Fri, Sep 26, 2003 at 11:32:30PM +0800, Michael Frank wrote:
> On Friday 26 September 2003 22:07, Måns Rullgård wrote:
> > Michael Frank <[email protected]> writes:
> >
> > > Suspect chipset related issue which should be looked into.
> >
> > That's what someone told me three months ago, too. Nothing happened,
> > though.
> >
>
> OK, now that we are two, we copy the IDE maintainer ;)

Actually, it's me who wrote the 961 and 963 support. It works fine for
most people. Did you check your cabling?

>
> I guess it is fair to say that we are happy to test patches.
>
> And here is my lspci -vv.
>
> 00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (prog-if 80 [Master])
> Subsystem: Micro-Star International Co., Ltd.: Unknown device 5332
> Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> Latency: 128
> Interrupt: pin ? routed to IRQ 10
> Region 4: I/O ports at 4000 [size=16]
> Capabilities: <available only to root>
>
>
> Regards
> Michael
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--
Vojtech Pavlik
SuSE Labs, SuSE CR

2003-09-26 17:47:32

by Måns Rullgård

Subject: Re: [BUG?] SIS IDE DMA errors

Vojtech Pavlik <[email protected]> writes:

> Actually, it's me who wrote the 961 and 963 support. It works fine for
> most people. Did you check your cabling?

I'm dealing with a laptop, but I suppose I could wiggle the cables a
bit. I still doubt it's a cable problem, since reading works
flawlessly.

It appears to me that during heavy IO load, some DMA interrupts get
lost, for some reason.

--
Måns Rullgård
[email protected]

2003-09-26 17:55:17

by Vojtech Pavlik

Subject: Re: [BUG?] SIS IDE DMA errors

On Fri, Sep 26, 2003 at 07:27:35PM +0200, Måns Rullgård wrote:
> Vojtech Pavlik <[email protected]> writes:
>
> > Actually, it's me who wrote the 961 and 963 support. It works fine for
> > most people. Did you check your cabling?
>
> I'm dealing with a laptop, but I suppose I could wiggle the cables a
> bit. I still doubt it's a cable problem, since reading works
> flawlessly.

Hmm, that's indeed interesting and it'd point to a driver problem -
when reading, the drive is dictating the timing, but when writing, it's
the controller's turn.

So if the controller timing is not correctly programmed, reads function,
but writes don't.

Can you send me the output of 'lspci -vvxxx' of the IDE device?
I'll take a look to see if it looks correct.

> It appears to me that during heavy IO load, some DMA interrupts get
> lost, for some reason.

Well, I've got this feeling that not just IDE interrupts get lost under
heavy IO load with recent kernels ...

--
Vojtech Pavlik
SuSE Labs, SuSE CR

2003-09-26 18:05:47

by Måns Rullgård

Subject: Re: [BUG?] SIS IDE DMA errors

Vojtech Pavlik <[email protected]> writes:

>> > Actually, it's me who wrote the 961 and 963 support. It works fine for
>> > most people. Did you check your cabling?
>>
>> I'm dealing with a laptop, but I suppose I could wiggle the cables a
>> bit. I still doubt it's a cable problem, since reading works
>> flawlessly.
>
> Hmm, that's indeed interesting and it'd point to a driver problem -

See, I told you :)

> when reading, the drive is dictating the timing, but when writing, it's
> the controller's turn.
>
> So if the controller timing is not correctly programmed, reads function,
> but writes don't.

Furthermore, short writes work just fine. The errors usually start
happening after about 100 MB at full speed. When copying from NFS
over a 100 Mbit/s network it usually goes a little longer, sometimes
even up to 500 MB. All this could indicate that there is some error
in the timing, and that it takes some time for it to build up enough to
trigger the bad things. Or am I wrong?
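The threshold described above (roughly 100 MB of sustained writing) could be probed with a small timed write test. This is only a sketch under assumptions: the function name, its parameters, and the scratch path in the usage comment are hypothetical, and it should only be pointed at a scratch file on the affected disk.

```python
import os
import time


def timed_write(path, total_mb, chunk_mb=1, fsync_every_mb=16):
    """Write about total_mb MiB of zeros to path and report throughput.

    Returns (mib_written, mib_per_second). Data is pushed to the disk
    with fsync() every fsync_every_mb MiB so the page cache cannot
    absorb the whole write.
    """
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    written = 0
    start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_mb:
            f.write(chunk)
            written += chunk_mb
            if written % fsync_every_mb == 0:
                f.flush()
                os.fsync(f.fileno())
        # Final flush so the timing covers all of the data.
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.monotonic() - start
    rate = written / elapsed if elapsed > 0 else float("inf")
    return written, rate


# Hypothetical usage: step the size up and watch dmesg for
# "dma_timer_expiry" after each run.
#
#   for size in (50, 100, 200, 500):
#       print(size, timed_write("/tmp/scratch.bin", size))
```

Stepping the size up while watching the kernel log would show whether the failure tracks the amount written or the time spent writing.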

Why can't the drive give notice when it's ready to accept more data?
That would seem like the simple solution, instead of trying to
synchronize the timers.

> Can you send me the output of 'lspci -vvxxx' of the IDE device?
> I'll take a look to see if it looks correct.

Here you go:

00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (rev d0) (prog-if 80 [Master])
Subsystem: Asustek Computer, Inc.: Unknown device 1688
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 128
Region 4: I/O ports at b800 [size=16]
00: 39 10 13 55 07 00 00 00 d0 80 01 01 00 80 80 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 01 b8 00 00 00 00 00 00 00 00 00 00 43 10 88 16
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
40: 31 81 00 00 31 85 00 00 08 01 e6 51 00 02 00 02
50: 01 00 01 06 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
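The first rows of the dump above are the standard PCI configuration header, so the summary lines lspci prints can be cross-checked against the raw bytes. A minimal sketch in Python, assuming the standard header layout (vendor/device at 0x00, command at 0x04, status at 0x06, BAR4 at 0x20, subsystem IDs at 0x2c); the function names are made up for illustration, and the vendor-specific timing registers from 0x40 upward are not interpreted here.

```python
# First three rows of the dump above (offsets 0x00-0x2f).
DUMP = """\
00: 39 10 13 55 07 00 00 00 d0 80 01 01 00 80 80 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 01 b8 00 00 00 00 00 00 00 00 00 00 43 10 88 16
"""


def parse_lspci_hex(dump):
    """Turn 'NN: xx xx ...' rows into 256 bytes indexed by config offset."""
    data = bytearray(256)
    for line in dump.strip().splitlines():
        offset, _, rest = line.partition(":")
        base = int(offset, 16)
        for i, byte in enumerate(rest.split()):
            data[base + i] = int(byte, 16)
    return bytes(data)


def le16(cfg, off):
    """Read a little-endian 16-bit value at offset off."""
    return cfg[off] | (cfg[off + 1] << 8)


def le32(cfg, off):
    """Read a little-endian 32-bit value at offset off."""
    return le16(cfg, off) | (le16(cfg, off + 2) << 16)


def summarize(cfg):
    """Decode the header fields that lspci -vv summarises."""
    command = le16(cfg, 0x04)
    status = le16(cfg, 0x06)
    bar4 = le32(cfg, 0x20)
    return {
        "vendor": le16(cfg, 0x00),
        "device": le16(cfg, 0x02),
        "io": bool(command & 0x1),
        "mem": bool(command & 0x2),
        "bus_master": bool(command & 0x4),
        "cap_list": bool(status & 0x10),
        "devsel": ("fast", "medium", "slow", "reserved")[(status >> 9) & 3],
        "revision": cfg[0x08],
        "prog_if": cfg[0x09],
        # Bit 0 set marks an I/O BAR; the low two bits are not address bits.
        "bar4_io_base": (bar4 & ~0x3) if bar4 & 1 else None,
        "subsystem": (le16(cfg, 0x2C), le16(cfg, 0x2E)),
    }


if __name__ == "__main__":
    for key, value in summarize(parse_lspci_hex(DUMP)).items():
        print(key, value)
```

For this dump the decoded values agree with the text summary: I/O+, Mem+, BusMaster+, DEVSEL=fast, no capability list, and an I/O region at b800.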


>> It appears to me that during heavy IO load, some DMA interrupts get
>> lost, for some reason.
>
> Well, I've got this feeling that not just IDE interrupts get lost under
> heavy IO load with recent kernels ...

Like mouse and keyboard...

--
Måns Rullgård
[email protected]

2003-09-26 18:29:51

by Michael Frank

Subject: Re: [BUG?] SIS IDE DMA errors

On Saturday 27 September 2003 01:53, Vojtech Pavlik wrote:
> On Fri, Sep 26, 2003 at 07:27:35PM +0200, Måns Rullgård wrote:
> > Vojtech Pavlik <[email protected]> writes:
> >
> > > Actually, it's me who wrote the 961 and 963 support. It works fine for
> > > most people. Did you check your cabling?
> >
> > I'm dealing with a laptop, but I suppose I could wiggle the cables a
> > bit. I still doubt it's a cable problem, since reading works
> > flawlessly.
>
> Hmm, that's indeed interesting and it'd point to a driver problem -
> when reading, the drive is dictating the timing, but when writing, it's
> the controller's turn.
>
> So if the controller timing is not correctly programmed, reads function,
> but writes don't.
>
> Can you send me the output of 'lspci -vvxxx' of the IDE device?
> I'll take a look to see if it looks correct.


00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (prog-if 80 [Master])
Subsystem: Micro-Star International Co., Ltd.: Unknown device 5332
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 128
Interrupt: pin ? routed to IRQ 10
Region 4: I/O ports at 4000 [size=16]
Capabilities: [58] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 39 10 13 55 05 00 10 02 00 80 01 01 00 80 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 01 40 00 00 00 00 00 00 00 00 00 00 62 14 32 53
30: 00 00 00 00 58 00 00 00 00 00 00 00 00 00 00 00
40: 00 00 00 00 00 00 00 00 00 00 06 00 00 00 00 00
50: f2 07 f2 07 ea 96 d5 d0 01 00 02 86 00 00 00 00
60: ff aa ff aa 00 00 00 00 00 00 00 00 00 00 00 00
70: 17 21 06 04 00 60 1c 1e 00 60 1c 1e 00 60 1c 1e
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00


>
> > It appears to me that during heavy IO load, some DMA interrupts get
> > lost, for some reason.
>
> Well, I've got this feeling that not just IDE interrupts get lost under
> heavy IO load with recent kernels ...

Timer interrupts too, as clocks seem to run slow on a number of
machines.

Regards
Michael


2003-09-26 18:24:31

by Michael Frank

Subject: Re: [BUG?] SIS IDE DMA errors

On Saturday 27 September 2003 01:27, Måns Rullgård wrote:
> Vojtech Pavlik <[email protected]> writes:
>
> > Actually, it's me who wrote the 961 and 963 support. It works fine for
> > most people. Did you check your cabling?
>
> I'm dealing with a laptop, but I suppose I could wiggle the cables a
> bit. I still doubt it's a cable problem, since reading works
> flawlessly.

And the cables are nicely short...

>
> It appears to me that during heavy IO load, some DMA interrupts get
> lost, for some reason.

Could you cat /proc/interrupts to see if the IDE interrupt is shared
by chance?
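Checking for a shared IDE interrupt can also be scripted. The sketch below is an illustration only: the sample text is made up (the counts and the devices on IRQ 9 are hypothetical), the function name is mine, and the parser assumes the classic layout of one count column per CPU followed by a single-word controller name such as XT-PIC.

```python
# Hypothetical sample in the classic single-CPU /proc/interrupts layout.
SAMPLE = """\
           CPU0
  0:    1000000          XT-PIC  timer
  9:      20000          XT-PIC  acpi, ohci-hcd, ohci-hcd
 14:      50000          XT-PIC  ide0
 15:        100          XT-PIC  ide1
NMI:          0
"""


def ide_irqs(text, needle="ide"):
    """Map IRQ number -> list of devices for lines that mention needle.

    An IRQ is shared when its device list has more than one entry.
    """
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())  # header row has one label per CPU
    result = {}
    for line in lines[1:]:
        irq, sep, rest = line.partition(":")
        if not sep or not irq.strip().isdigit():
            continue  # skip NMI/ERR style summary rows
        fields = rest.split(None, ncpu + 1)
        if len(fields) < ncpu + 2:
            continue
        devices = [d.strip() for d in fields[ncpu + 1].split(",")]
        if any(needle in d for d in devices):
            result[int(irq)] = devices
    return result


if __name__ == "__main__":
    # On a live system, read the real file instead of SAMPLE:
    #   text = open("/proc/interrupts").read()
    for irq, devices in ide_irqs(SAMPLE).items():
        state = "shared" if len(devices) > 1 else "not shared"
        print(irq, devices, state)
```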

Regards
Michael


2003-09-26 18:24:39

by Michael Frank

Subject: Re: [BUG?] SIS IDE DMA errors

On Saturday 27 September 2003 00:59, Vojtech Pavlik wrote:
> On Fri, Sep 26, 2003 at 11:32:30PM +0800, Michael Frank wrote:
> > On Friday 26 September 2003 22:07, Måns Rullgård wrote:
> > > Michael Frank <[email protected]> writes:
> > >
> > > > Suspect chipset related issue which should be looked into.
> > >
> > > That's what someone told me three months ago, too. Nothing happened,
> > > though.
> > >
> >
> > OK, now that we are two, we copy the IDE maintainer ;)
>
> Actually, it's me who wrote the 961 and 963 support. It works fine for
> most people. Did you check your cabling?

It is an 80-conductor cable.

2.4.22-pre7 was terrible for me; since pre9 or so the problem has almost
gone away, with the same cable. These days I get the message only once a
week or so.

Interestingly, we both use IBM drives, albeit of different generations.

I found that the udma setting affects speed but not the dma-timer-expiry problem.

/dev/hda:

ATA device, with non-removable media
powers-up in standby; SET FEATURES subcmd spins-up.
Model Number: IC35L090AVV207-0
Serial Number: VNVC00G3CABSMD
Firmware Revision: V23OA63A
Standards:
Used: ATA/ATAPI-6 T13 1410D revision 3a
Supported: 6 5 4 3
Configuration:
Logical max current
cylinders 16383 65535
heads 16 1
sectors/track 63 63
--
CHS current addressable sectors: 4128705
LBA user addressable sectors: 160836480
LBA48 user addressable sectors: 160836480
device size with M = 1024*1024: 78533 MBytes
device size with M = 1000*1000: 82348 MBytes (82 GB)
Capabilities:
LBA, IORDY(can be disabled)
bytes avail on r/w long: 52 Queue depth: 32
Standby timer values: spec'd by Standard, no device specific minimum
R/W multiple sector transfer: Max = 16 Current = 16
Advanced power management level: unknown setting (0x0000)
Recommended acoustic management value: 128, current value: 254
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=240ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* NOP cmd
* READ BUFFER cmd
* WRITE BUFFER cmd
* Host Protected Area feature set
Release interrupt
* Look-ahead
* Write cache
* Power Management feature set
Security Mode feature set
* SMART feature set
* FLUSH CACHE EXT command
* Mandatory FLUSH CACHE command
* Device Configuration Overlay feature set
* 48-bit Address feature set
Automatic Acoustic Management feature set
SET MAX security extension
Address Offset Reserved Area Boot
SET FEATURES subcommand required to spinup after power up
Power-Up In Standby feature set
Advanced Power Management feature set
* READ/WRITE DMA QUEUED
* General Purpose Logging feature set
* SMART self-test
* SMART error logging
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
not supported: enhanced erase
46min for SECURITY ERASE UNIT.
HW reset results:
CBLID- above Vih
Device num = 0 determined by the jumper
Checksum: correct


> > Regards
> > Michael

2003-09-26 18:34:13

by Vojtech Pavlik

Subject: Re: [BUG?] SIS IDE DMA errors

On Fri, Sep 26, 2003 at 07:46:03PM +0200, Måns Rullgård wrote:
> Vojtech Pavlik <[email protected]> writes:
>
> >> > Actually, it's me who wrote the 961 and 963 support. It works fine for
> >> > most people. Did you check your cabling?
> >>
> >> I'm dealing with a laptop, but I suppose I could wiggle the cables a
> >> bit. I still doubt it's a cable problem, since reading works
> >> flawlessly.
> >
> > Hmm, that's indeed interesting and it'd point to a driver problem -
>
> See, I told you :)
>
> > when reading, the drive is dictating the timing, but when writing, it's
> > the controller's turn.
> >
> > So if the controller timing is not correctly programmed, reads function,
> > but writes don't.
>
> Furthermore, short writes work just fine. The errors usually start
> happening after about 100 MB at full speed. When copying from NFS
> over a 100 Mb/s network it usually goes a little longer, sometimes
> even up to 500 MB. All this could indicate that there is some error
> in the timing, and that it takes some time for it to build up enough
> to trigger the bad things. Or am I wrong?

Well, yes. There's nothing to build up. There are no two timers to
synchronize - basically the controller sends the data at a certain speed
and the drive must be able to understand the data at that speed. So, if
you configure the controller to UDMA133 and the drive can only do
UDMA100, it'll fail sooner or later. It doesn't necessarily fail
immediately, since the drive has some margin above its engineered speed
that it'll be able to receive.

> Why can't the drive give notice when it's ready to accept more data?

It does, it does. The problem would only occur if the signalling rate
was too high for the drive to receive it. If the drive's buffers are
full, it'll signal the controller to delay sending, but first the data
must reach the buffer.

> That would seem like the simple solution, instead of trying to
> synchronize the timers.

There fortunately are no timers to be synchronized. However, you can't
do the handshake at every single byte, that'd slow down the transfers
considerably.

>
> > Can you send me the output of 'lspci -vvxxx' of the IDE device?
> > I'll take a look to see if it looks correct.
>
> Here you go:

Thanks.

> 00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (rev d0) (prog-if 80 [Master])
> Subsystem: Asustek Computer, Inc.: Unknown device 1688
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> Latency: 128
> Region 4: I/O ports at b800 [size=16]
> 00: 39 10 13 55 07 00 00 00 d0 80 01 01 00 80 80 00
> 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 20: 01 b8 00 00 00 00 00 00 00 00 00 00 43 10 88 16
> 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 40: 31 81 00 00 31 85 00 00 08 01 e6 51 00 02 00 02

Ok, this means:

31 - hda: 90ns data active time, 30 ns data recovery time (PIO4)
81 - hda: UDMA enabled, UDMA mode 5 (UDMA100)
00 - hdb: 240ns/360ns (PIO0) - no drive present
00 - hdb: UDMA disabled
31 - hdc: 90ns/30ns PIO4
85 - hdc: UDMA enabled, UDMA mode 2 (UDMA33)
00 - hdd: 240ns/360ns (PIO0) - no drive present
00 - hdd: UDMA disabled

So the config is correct if /dev/hda is your hard drive, capable of
UDMA100, and /dev/hdc is a CD-ROM capable of UDMA33. Is that
right?
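
As an illustration of the per-drive decoding above, here is a minimal Python sketch that splits the quoted row at offset 0x40 into one byte pair per drive. It is not the driver's code. It assumes, based only on the values in this dump, that bit 7 of the second byte in each pair is the UDMA enable flag and that the low nibble selects the cycle time, with 0x1 meaning UDMA mode 5 and 0x5 meaning UDMA mode 2 on this chipset. Any other nibble is reported as unknown rather than guessed, and all names are made up for the example.

```python
ROW_40 = "40: 31 81 00 00 31 85 00 00 08 01 e6 51 00 02 00 02"

# Assumption: only the two nibble values that appear in this thread are mapped.
KNOWN_UDMA_NIBBLES = {0x1: 5, 0x5: 2}


def parse_lspci_hex_row(row):
    """Split an 'lspci -xxx' row into (offset, list of byte values)."""
    offset_text, _, byte_text = row.partition(":")
    return int(offset_text, 16), [int(b, 16) for b in byte_text.split()]


def decode_drive_timing(row):
    """Decode the first eight bytes of the row as four (timing, UDMA) pairs."""
    _, data = parse_lspci_hex_row(row)
    result = {}
    for index, name in enumerate(["hda", "hdb", "hdc", "hdd"]):
        timing_byte = data[2 * index]
        udma_byte = data[2 * index + 1]
        enabled = bool(udma_byte & 0x80)  # assumed enable flag (bit 7)
        mode = KNOWN_UDMA_NIBBLES.get(udma_byte & 0x0F) if enabled else None
        result[name] = {
            "timing_byte": timing_byte,
            "udma_enabled": enabled,
            "udma_mode": mode,  # None means disabled or not covered by the table
        }
    return result


for drive, info in decode_drive_timing(ROW_40).items():
    print(drive, info)
```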

08 - 80-wire cables (needed for UDMA44 and higher) NOT installed.
FIFO threshold set to 3/4 for read and to 1/4 for write.

01 - IDE controller in compatibility mode. Native and test modes
disabled. (normal)

e6 - PCI burst enable, EDB R-R pipeline enable, Fast postwrite enable,
device ID masqueraded as sis5513 (although real is 5517)
channels 0 and 1 enabled in normal mode

51 - Postwrite enabled on hda and hdc, prefetch on hda only

00 02 - 512 bytes prefetch size for hda
00 02 - 512 bytes prefetch size for hdc

All this is OK, possibly except for the 80-wire cable not being present,
but if this is a notebook, there might be a completely different cable
type than what's standard, and the detection might not work there.

> 50: 01 00 01 06 00 00 00 00 00 00 00 00 00 00 00 00
> 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

> >> It appears to me that during heavy IO load, some DMA interrupts get
> >> lost, for some reason.
> >
> > Well, I've got this feeling that not just IDE interrupts get lost under
> > heavy IO load with recent kernels ...
>
> Like mouse and keyboard...

Like everything. But only for mouse, keyboard, timer and ide it HURTS.

--
Vojtech Pavlik
SuSE Labs, SuSE CR

2003-09-26 19:21:04

by Michael Frank

Subject: Re: [BUG?] SIS IDE DMA errors

On Saturday 27 September 2003 02:33, Vojtech Pavlik wrote:
> On Fri, Sep 26, 2003 at 07:46:03PM +0200, Måns Rullgård wrote:
> > Vojtech Pavlik <[email protected]> writes:
> >
> > >> > Actually, it's me who wrote the 961 and 963 support. It works fine for
> > >> > most people. Did you check your cabling?
> > >>
> > >> I'm dealing with a laptop, but I suppose I could wiggle the cables a
> > >> bit. I still doubt it's a cable problem, since reading works
> > >> flawlessly.
> > >
> > > Hmm, that's indeed interesting and it'd point to a driver problem -
> >
> > See, I told you :)
> >
> > > when reading, the drive is dictating the timing, but when writing, it's
> > > the controller's turn.
> > >
> > > So if the controller timing is not correctly programmed, reads function,
> > > but writes don't.
> >
> > Furthermore, short writes work just fine. The errors usually start
> > happening after about 100 MB at full speed. When copying from NFS
> > over a 100 Mb/s network it usually goes a little longer, sometimes
> > even up to 500 MB. All this could indicate that there is some error
> > in the timing, and that it takes some time for it to build up enough
> > to trigger the bad things. Or am I wrong?
>
> Well, yes. There's nothing to build up. There are no two timers to
> synchronize - basically the controller sends the data at a certain speed
> and the drive must be able to understand the data at that speed. So, if
> you configure the controller to UDMA133 and the drive can only do
> UDMA100, it'll fail sooner or later. It doesn't necessarily fail
> immediately, since the drive has some margin above its engineered speed
> that it'll be able to receive.
>
> > Why can't the drive give notice when it's ready to accept more data?
>
> It does, it does. The problem would only occur if the signalling rate
> was too high for the drive to receive it. If the drive's buffers are
> full, it'll signal the controller to delay sending, but first the data
> must reach the buffer.
>
> > That would seem like the simple solution, instead of trying to
> > synchronize the timers.
>
> There fortunately are no timers to be synchronized. However, you can't
> do the handshake at every single byte, that'd slow down the transfers
> considerably.
>
> >
> > > Can you send me the output of 'lspci -vvxxx' of the IDE device?
> > > I'll take a look to see if it looks correct.
> >
> > Here you go:
>
> Thanks.
>
> > 00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (rev d0) (prog-if 80 [Master])
> > Subsystem: Asustek Computer, Inc.: Unknown device 1688
> > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> > Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> > Latency: 128
> > Region 4: I/O ports at b800 [size=16]
> > 00: 39 10 13 55 07 00 00 00 d0 80 01 01 00 80 80 00
> > 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > 20: 01 b8 00 00 00 00 00 00 00 00 00 00 43 10 88 16
> > 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > 40: 31 81 00 00 31 85 00 00 08 01 e6 51 00 02 00 02
>
> Ok, this means:
>
> 31 - hda: 90ns data active time, 30 ns data recovery time (PIO4)
> 81 - hda: UDMA enabled, UDMA mode 5 (UDMA100)
> 00 - hdb: 240ns/360ns (PIO0) - no drive present
> 00 - hdb: UDMA disabled
> 31 - hdc: 90ns/30ns PIO4
> 85 - hdc: UDMA enabled, UDMA mode 2 (UDMA33)
> 00 - hdd: 240ns/360ns (PIO0) - no drive present
> 00 - hdd: UDMA disabled
>
> So the config is correct if /dev/hda is your hard drive, capable of
> UDMA100, and /dev/hdc is a CD-ROM capable of UDMA33. Is that
> right?
>
> 08 - 80-wire cables (needed for UDMA44 and higher) NOT installed.
> FIFO threshold set to 3/4 for read and to 1/4 for write.
>
> 01 - IDE controller in compatibility mode. Native and test modes
> disabled. (normal)
>
> e6 - PCI burst enable, EDB R-R pipeline enable, Fast postwrite enable,
> device ID masqueraded as sis5513 (although real is 5517)
> channels 0 and 1 enabled in normal mode
>
> 51 - Postwrite enabled on hda and hdc, prefetch on hda only
>
> 00 02 - 512 bytes prefetch size for hda
> 00 02 - 512 bytes prefetch size for hdc
>
> All this is OK, possibly except for the 80-wire cable not being present,
> but if this is a notebook, there might be a completely different cable
> type than what's standard, and the detection might not work there.
>
> > 50: 01 00 01 06 00 00 00 00 00 00 00 00 00 00 00 00
> > 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
> > >> It appears to me that during heavy IO load, some DMA interrupts get
> > >> lost, for some reason.
> > >
> > > Well, I've got this feeling that not just IDE interrupts get lost under
> > > heavy IO load with recent kernels ...
> >
> > Like mouse and keyboard...
>
> Like everything. But only for mouse, keyboard, timer and ide it HURTS.
>
> --
> Vojtech Pavlik
> SuSE Labs, SuSE CR
>
>

Was running 2.4.22.

Now running 2.6.0-test5. Fresh boot.

00:0f.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
Subsystem: Realtek Semiconductor Co., Ltd. RT8139
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (8000ns min, 16000ns max)
Interrupt: pin A routed to IRQ 11
Region 0: I/O ports at e000 [size=256]
Region 1: Memory at eb102000 (32-bit, non-prefetchable) [size=256]
Expansion ROM at <unassigned> [disabled] [size=64K]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: ec 10 39 81 07 00 90 02 10 00 00 02 00 20 00 00
10: 01 e0 00 00 00 20 10 eb 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 ec 10 39 81
30: 00 00 00 00 50 00 00 00 00 00 00 00 0b 01 20 40
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

I am surprised at your analysis of the pci bus data. By what you
stated my drive(r) should be doing PIO ;)

50: 01 00 c2 f7 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

/dev/hda:
Timing buffer-cache reads: 128 MB in 0.36 seconds =352.67 MB/sec
Timing buffered disk reads: 64 MB in 1.20 seconds = 53.25 MB/sec
[root@mhfl4 03:10:20 mhf]# v ht

/dev/hda:
Timing buffer-cache reads: 128 MB in 0.37 seconds =346.00 MB/sec
Timing buffered disk reads: 64 MB in 1.20 seconds = 53.21 MB/sec

It does 53 MB/s, and according to the drive info I mailed earlier, the
drive reports that it is set to udma5.
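
As a rough cross-check of that reasoning, the following Python sketch compares a measured throughput with the nominal peak rates of the Ultra DMA modes and reports the slowest mode that could carry it. The rate table uses the commonly quoted nominal figures; the function names are made up for the example, and real sustained throughput also depends on the drive itself.

```python
# Nominal peak transfer rates of the Ultra DMA modes, in MB/s.
UDMA_RATE_MB_S = {0: 16.7, 1: 25.0, 2: 33.3, 3: 44.4, 4: 66.7, 5: 100.0, 6: 133.0}


def throughput_mb_s(megabytes, seconds):
    """Average throughput for a transfer of the given size and duration."""
    return megabytes / seconds


def slowest_sufficient_udma(rate_mb_s):
    """Lowest UDMA mode whose nominal rate is at least rate_mb_s, or None."""
    for mode in sorted(UDMA_RATE_MB_S):
        if UDMA_RATE_MB_S[mode] >= rate_mb_s:
            return mode
    return None


# 64 MB in 1.20 seconds, as in the hdparm timing above.
rate = throughput_mb_s(64, 1.20)
print(f"{rate:.1f} MB/s needs at least UDMA mode {slowest_sufficient_udma(rate)}")
```

A sustained rate above 44.4 MB/s cannot come from PIO or from UDMA mode 3 and below, which is consistent with the drive running in one of the higher UDMA modes.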

Regards
Michael


2003-09-26 19:44:38

by Michael Frank

Subject: Re: [BUG?] SIS IDE DMA errors

On Friday 26 September 2003 23:38, Måns Rullgård wrote:
> Michael Frank <[email protected]> writes:
>
> >> > Suspect chipset related issue which should be looked into.
> >>
> >> That's what someone told me three months ago, too. Nothing happened,
> >> though.
> >>
> >
> > OK, now that we are two, we copy the IDE maintainer ;)
> >
> > I guess it is fair to say that we are happy to test patches.
> >
> > And here is my lspci -vv.
> >
> > 00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (prog-if 80 [Master])
> > Subsystem: Micro-Star International Co., Ltd.: Unknown device 5332
> > Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> > Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> > Latency: 128
> > Interrupt: pin ? routed to IRQ 10
> > Region 4: I/O ports at 4000 [size=16]
> > Capabilities: <available only to root>
>
> Mine looks rather similar, but there are a few differences. Mine has
> Mem+ and DEVSEL=fast.
>
> 00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (rev d0) (prog-if 80 [Master])
> Subsystem: Asustek Computer, Inc.: Unknown device 1688
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> Latency: 128
> Region 4: I/O ports at b800 [size=16]
>
>
>

Here is my ATA config for 2.6.0-test5. Could you please send the same
section from your .config? I will build it and see if anything changes.


#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y

#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
# CONFIG_IDEDISK_STROKE is not set
CONFIG_BLK_DEV_IDECS=m
CONFIG_BLK_DEV_IDECD=m
# CONFIG_BLK_DEV_IDETAPE is not set
CONFIG_BLK_DEV_IDEFLOPPY=m
# CONFIG_BLK_DEV_IDESCSI is not set
# CONFIG_IDE_TASK_IOCTL is not set
CONFIG_IDE_TASKFILE_IO=y

#
# IDE chipset support/bugfixes
#
# CONFIG_BLK_DEV_CMD640 is not set
# CONFIG_BLK_DEV_IDEPNP is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
CONFIG_BLK_DEV_OPTI621=y
CONFIG_BLK_DEV_RZ1000=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDE_TCQ is not set
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
# CONFIG_IDEDMA_PCI_WIP is not set
CONFIG_BLK_DEV_ADMA=y
CONFIG_BLK_DEV_AEC62XX=y
CONFIG_BLK_DEV_ALI15X3=y
CONFIG_WDC_ALI15X3=y
CONFIG_BLK_DEV_AMD74XX=y
CONFIG_BLK_DEV_CMD64X=y
CONFIG_BLK_DEV_TRIFLEX=y
CONFIG_BLK_DEV_CY82C693=y
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
CONFIG_BLK_DEV_HPT34X=y
CONFIG_BLK_DEV_HPT366=y
CONFIG_BLK_DEV_SC1200=y
CONFIG_BLK_DEV_PIIX=y
CONFIG_BLK_DEV_NS87415=y
CONFIG_BLK_DEV_PDC202XX_OLD=y
# CONFIG_PDC202XX_BURST is not set
CONFIG_BLK_DEV_PDC202XX_NEW=y
# CONFIG_PDC202XX_FORCE is not set
CONFIG_BLK_DEV_SVWKS=y
CONFIG_BLK_DEV_SIIMAGE=y
CONFIG_BLK_DEV_SIS5513=y
CONFIG_BLK_DEV_SLC90E66=y
CONFIG_BLK_DEV_TRM290=y
CONFIG_BLK_DEV_VIA82CXXX=y
# CONFIG_IDE_CHIPSETS is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_DMA_NONPCI is not set
# CONFIG_BLK_DEV_HD is not set



Regards
Michael

2003-09-27 06:18:00

by Vojtech Pavlik

Subject: Re: [BUG?] SIS IDE DMA errors

On Sat, Sep 27, 2003 at 03:19:37AM +0800, Michael Frank wrote:

> > > 00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (rev d0) (prog-if 80 [Master])
> > > Subsystem: Asustek Computer, Inc.: Unknown device 1688
> > > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> > > Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> > > Latency: 128
> > > Region 4: I/O ports at b800 [size=16]
> > > 00: 39 10 13 55 07 00 00 00 d0 80 01 01 00 80 80 00
> > > 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > 20: 01 b8 00 00 00 00 00 00 00 00 00 00 43 10 88 16
> > > 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > 40: 31 81 00 00 31 85 00 00 08 01 e6 51 00 02 00 02
> >
> > Ok, this means:
> >
> > 31 - hda: 90ns data active time, 30 ns data recovery time (PIO4)
> > 81 - hda: UDMA enabled, UDMA mode 5 (UDMA100)
> > 00 - hdb: 240ns/360ns (PIO0) - no drive present
> > 00 - hdb: UDMA disabled
> > 31 - hdc: 90ns/30ns PIO4
> > 85 - hdc: UDMA enabled, UDMA mode 2 (UDMA33)
> > 00 - hdd: 240ns/360ns (PIO0) - no drive present
> > 00 - hdd: UDMA disabled
>
> Was running 2.4.22.
>
> Now running 2.6.0-test5. Fresh boot.
>
> 00:0f.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
> Subsystem: Realtek Semiconductor Co., Ltd. RT8139
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> Latency: 32 (8000ns min, 16000ns max)
> Interrupt: pin A routed to IRQ 11
> Region 0: I/O ports at e000 [size=256]
> Region 1: Memory at eb102000 (32-bit, non-prefetchable) [size=256]
> Expansion ROM at <unassigned> [disabled] [size=64K]
> Capabilities: [50] Power Management version 2
> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
> Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> 00: ec 10 39 81 07 00 90 02 10 00 00 02 00 20 00 00
> 10: 01 e0 00 00 00 20 10 eb 00 00 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 ec 10 39 81
> 30: 00 00 00 00 50 00 00 00 00 00 00 00 0b 01 20 40
> 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
> I am surprised at your analysis of the pci bus data. By what you
> stated my drive(r) should be doing PIO ;)

1) You're looking at your ethernet controller config registers,
not at the IDE controller config registers.

2) The 961 and 963 have completely different config register layouts.
Actually, there is not much in common between the 961 and the 963,
except for the '5513' fake ID. (The 961's real ID is 5517, and the
963's is 5518).
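
One way to tell which device a hex dump belongs to is to read the first row of its PCI configuration space: bytes 0-1 are the vendor ID and bytes 2-3 the device ID (both little-endian), byte 8 is the revision, and bytes 9-11 are the programming interface, subclass and base class. The Python sketch below applies that standard header layout to the two first rows quoted in this thread; the function names are made up for the example.

```python
IDE_ROW = "00: 39 10 13 55 07 00 00 00 d0 80 01 01 00 80 80 00"
NIC_ROW = "00: ec 10 39 81 07 00 90 02 10 00 00 02 00 20 00 00"


def parse_row(row):
    """Return the byte values of one 'lspci -xxx' row, ignoring the offset."""
    _, _, byte_text = row.partition(":")
    return [int(b, 16) for b in byte_text.split()]


def identify(row0):
    """Decode the identification fields from config space offsets 0x00-0x0f."""
    b = parse_row(row0)
    return {
        "vendor": b[0] | (b[1] << 8),   # little-endian 16-bit vendor ID
        "device": b[2] | (b[3] << 8),   # little-endian 16-bit device ID
        "revision": b[8],
        "prog_if": b[9],
        "subclass": b[10],
        "base_class": b[11],            # 0x01 = mass storage, 0x02 = network
    }


for name, row in (("first dump", IDE_ROW), ("second dump", NIC_ROW)):
    ids = identify(row)
    print(f"{name}: {ids['vendor']:04x}:{ids['device']:04x} "
          f"class {ids['base_class']:02x}{ids['subclass']:02x}")
```

The first row decodes to 1039:5513 with class 0101 (the SiS IDE interface, showing the 5513 ID mentioned above), while the second decodes to 10ec:8139 with class 0200, an ethernet controller.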

--
Vojtech Pavlik
SuSE Labs, SuSE CR

2003-09-27 06:40:57

by Michael Frank

Subject: Re: [BUG?] SIS IDE DMA errors

On Saturday 27 September 2003 14:13, Vojtech Pavlik wrote:
> On Sat, Sep 27, 2003 at 03:19:37AM +0800, Michael Frank wrote:
>
> > > > 00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (rev d0) (prog-if 80 [Master])
> > > > Subsystem: Asustek Computer, Inc.: Unknown device 1688
> > > > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> > > > Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> > > > Latency: 128
> > > > Region 4: I/O ports at b800 [size=16]
> > > > 00: 39 10 13 55 07 00 00 00 d0 80 01 01 00 80 80 00
> > > > 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > > 20: 01 b8 00 00 00 00 00 00 00 00 00 00 43 10 88 16
> > > > 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > > 40: 31 81 00 00 31 85 00 00 08 01 e6 51 00 02 00 02
> > >
> > > Ok, this means:
> > >
> > > 31 - hda: 90ns data active time, 30 ns data recovery time (PIO4)
> > > 81 - hda: UDMA enabled, UDMA mode 5 (UDMA100)
> > > 00 - hdb: 240ns/360ns (PIO0) - no drive present
> > > 00 - hdb: UDMA disabled
> > > 31 - hdc: 90ns/30ns PIO4
> > > 85 - hdc: UDMA enabled, UDMA mode 2 (UDMA33)
> > > 00 - hdd: 240ns/360ns (PIO0) - no drive present
> > > 00 - hdd: UDMA disabled
> >

> 1) You're looking at your ethernet controller config registers,
> not at the IDE controller config registers.

Oooooooooops, pasted the wrong one - was 3am ;)

Here it is with 2.4.22:

00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (prog-if 80 [Master])
Subsystem: Micro-Star International Co., Ltd.: Unknown device 5332
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 128
Interrupt: pin ? routed to IRQ 10
Region 4: I/O ports at 4000 [size=16]
Capabilities: [58] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 39 10 13 55 05 00 10 02 00 80 01 01 00 80 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 01 40 00 00 00 00 00 00 00 00 00 00 62 14 32 53
30: 00 00 00 00 58 00 00 00 00 00 00 00 00 00 00 00
40: 00 00 00 00 00 00 00 00 00 00 06 00 00 00 00 00
50: f2 07 f2 07 ea 96 d5 d0 01 00 02 86 00 00 00 00
60: ff aa ff aa 00 00 00 00 00 00 00 00 00 00 00 00
70: 17 21 06 04 00 60 1c 1e 00 60 1c 1e 00 60 1c 1e
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

2.6.0-test5 gives the same output.

Regards
Michael

2003-09-29 09:41:58

by Måns Rullgård

Subject: Re: [BUG?] SIS IDE DMA errors

Vojtech Pavlik <[email protected]> writes:

>> > Can you send me the output of 'lspci -vvxxx' of the IDE device?
>> > I'll take a look to see if it looks correct.
>>
>> Here you go:
>
> Thanks.
>
>> 00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (rev d0) (prog-if 80 [Master])
>> Subsystem: Asustek Computer, Inc.: Unknown device 1688
>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
>> Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
>> Latency: 128
>> Region 4: I/O ports at b800 [size=16]
>> 00: 39 10 13 55 07 00 00 00 d0 80 01 01 00 80 80 00
>> 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 20: 01 b8 00 00 00 00 00 00 00 00 00 00 43 10 88 16
>> 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 40: 31 81 00 00 31 85 00 00 08 01 e6 51 00 02 00 02
>
> Ok, this means:
>
> 31 - hda: 90ns data active time, 30 ns data recovery time (PIO4)
> 81 - hda: UDMA enabled, UDMA mode 5 (UDMA100)
> 00 - hdb: 240ns/360ns (PIO0) - no drive present
> 00 - hdb: UDMA disabled
> 31 - hdc: 90ns/30ns PIO4
> 85 - hdc: UDMA enabled, UDMA mode 2 (UDMA33)
> 00 - hdd: 240ns/360ns (PIO0) - no drive present
> 00 - hdd: UDMA disabled
>
> So the config is correct if /dev/hda is your hard drive, capable of
> UDMA100, and /dev/hdc is a CD-ROM capable of UDMA33. Is that
> right?

That's it.

> 08 - 80-wire cables (needed for UDMA44 and higher) NOT installed.
> FIFO threshold set to 3/4 for read and to 1/4 for write.
>
> 01 - IDE controller in compatibility mode. Native and test modes
> disabled. (normal)
>
> e6 - PCI burst enable, EDB R-R pipeline enable, Fast postwrite enable,
> device ID masqueraded as sis5513 (although real is 5517)
> channels 0 and 1 enabled in normal mode
>
> 51 - Postwrite enabled on hda and hdc, prefetch on hda only
>
> 00 02 - 512 bytes prefetch size for hda
> 00 02 - 512 bytes prefetch size for hdc
>
> All this is OK, possibly except for the 80-wire cable not being present,
> but if this is a notebook, there might be a completely different cable
> type than what's standard, and the detection might not work there.

I've got no idea what the cable is like. Is there anything to be
learned from opening the beast? Anything in particular to look for?

--
Måns Rullgård
[email protected]

2003-09-29 10:02:23

by Vojtech Pavlik

Subject: Re: [BUG?] SIS IDE DMA errors

On Mon, Sep 29, 2003 at 11:22:28AM +0200, Måns Rullgård wrote:

> > 08 - 80-wire cables (needed for UDMA44 and higher) NOT installed.
> > FIFO threshold set to 3/4 for read and to 1/4 for write.
> >
> > 01 - IDE controller in compatibility mode. Native and test modes
> > disabled. (normal)
> >
> > e6 - PCI burst enable, EDB R-R pipeline enable, Fast postwrite enable,
> > device ID masqueraded as sis5513 (although real is 5517)
> > channels 0 and 1 enabled in normal mode
> >
> > 51 - Postwrite enabled on hda and hdc, prefetch on hda only
> >
> > 00 02 - 512 bytes prefetch size for hda
> > 00 02 - 512 bytes prefetch size for hdc
> >
> > All this is OK, possibly except for the 80-wire cable not being present,
> > but if this is a notebook, there might be a completely different cable
> > type than what's standard, and the detection might not work there.
>
> I've got no idea what the cable is like. Is there anything to be
> learned from opening the beast? Anything in particular to look for?

Not really, sorry.

--
Vojtech Pavlik
SuSE Labs, SuSE CR

2003-09-29 11:18:17

by Lionel Bouton

Subject: Re: [BUG?] SIS IDE DMA errors

Vojtech Pavlik said the following on 09/26/2003 07:53 PM:

>On Fri, Sep 26, 2003 at 07:27:35PM +0200, Måns Rullgård wrote:
>
>
>>Vojtech Pavlik <[email protected]> writes:
>>
>>
>>
>>>Actually, it's me who wrote the 961 and 963 support. It works fine for
>>>most people. Did you check your cabling?
>>>
>>>
>>I'm dealing with a laptop, but I suppose I could wiggle the cables a
>>bit. I still doubt it's a cable problem, since reading works
>>flawlessly.
>>
>>
>
>Hmm, that's indeed interesting and it'd point to a driver problem -
>when reading, the drive is dictating the timing, but when writing, it's
>the controller's turn.
>
>So if the controller timing is not correctly programmed, reads function,
>but writes don't.
>
>Can you send me the output of 'lspci -vvxxx' of the IDE device?
>I'll take a look to see if it looks correct.
>
>
>
>>It appears to me that during heavy IO load, some DMA interrupts get
>>lost, for some reason.
>>
>>
>
>Well, I've got this feeling that not just IDE interrupts get lost under
>heavy IO load with recent kernels ...
>
>
>

This could explain some odd reports. Amongst the usual causes like flaky
hardware, kernel misconfiguration and the like, I encountered some
people for whom IO-APIC support would throw their data away...

Now I always ask the users to recompile without IO-APIC. This usually
brings other problems (awful ethernet performance for one user comes to
mind) but tends to solve IDE instability.

So far I have not had a single report where lspci -vxxx highlighted any
IDE register misconfiguration. AFAICS your code *is* correct, Vojtech.

LB.

--
Lionel Bouton - inet6
---------------------------------------------------------------------
o Siege social: 51, rue de Verdun - 92158 Suresnes
/ _ __ _ Acces Bureaux: 33 rue Benoit Malon - 92150 Suresnes
/ /\ /_ / /_ France
\/ \/_ / /_/ Tel. +33 (0) 1 41 44 85 36
Inetsys S.A. Fax +33 (0) 1 46 97 20 10



2003-09-29 13:16:20

by Michael Frank

Subject: Re: [BUG?] SIS IDE DMA errors

On Monday 29 September 2003 17:23, Måns Rullgård wrote:
> Michael Frank <[email protected]> writes:
>
> > Here is my ATA config of 2.6.0-test5. Could you please send same ex
> > your .config. I will build it and see if anything changes.
> >
>
> OK, attaching.
>
>

I patched your ATA config into my config, built it and tested it
with the attached script. More usage info inside script.

$ tstinter start # creates 2 400MB files
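
For readers without the attachment, a sustained-write load of this general kind can be approximated with the minimal Python sketch below. It is not the tstinter script and makes no claim to reproduce what that script does; the function name, chunk size and use of fsync are assumptions chosen for illustration.

```python
import os
import tempfile
import time


def timed_write(path, total_mb, chunk_mb=1):
    """Write total_mb megabytes of zeros to path and return the rate in MB/s."""
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    start = time.monotonic()
    with open(path, "wb") as handle:
        for _ in range(total_mb // chunk_mb):
            handle.write(chunk)
        handle.flush()
        os.fsync(handle.fileno())  # make sure the data actually reaches the disk
    elapsed = time.monotonic() - start
    return total_mb / elapsed if elapsed > 0 else float("inf")
```

Calling, for example, timed_write("/some/path/on/hda/test.bin", 400) twice would write two 400 MB files; the path here is a placeholder, and the kernel log should be watched for DMA timeout messages while it runs.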

No problems seen, sorry this did not help.

You have no problems with recent 2.4 kernels?

Could you also run lspci -vvxxx under 2.4.22 and see if there is a
difference? Also stress it with the script if you can.

Regards
Michael


Attachments:
(No filename) (670.00 B)
tstinter (5.65 kB)