2002-01-18 02:27:58

by Anton Altaparmakov

Subject: Linux 2.5.3-pre1-aia1

Since the new IDE core from Andre is now solid as reported by various
people on IRC, here is my local patch (stable for me) which you can apply
to play with the shiny new IDE core (IDE core fix is same as
ata-253p1-2.bz2 from Jens). (-:

Patch available from:

http://www-stu.christs.cam.ac.uk/~aia21/linux/patch-2.5.3-pre1-aia1
http://www-stu.christs.cam.ac.uk/~aia21/linux/patch-2.5.3-pre1-aia1.bz2
http://www-stu.christs.cam.ac.uk/~aia21/linux/patch-2.5.3-pre1-aia1.gz

Linux 2.5.3-pre1-aia1

o Fix new IDE core (Jens Axboe, Andre Hedrick)
+ Configure help entries for IDE (Andre Hedrick, Rob Radez, me)
+ Reduce NTFS vmalloc use (NTFS 1.1.22) (me)
o Compile fixes for dnotify (me)
o Compile fixes for via82cxxx (me)

Patches marked "+" have been submitted to Linus by me already.

Enjoy,

Anton


--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/


2002-01-18 17:27:32

by Davide Libenzi

Subject: Re: Linux 2.5.3-pre1-aia1

On Fri, 18 Jan 2002, Anton Altaparmakov wrote:

> Since the new IDE core from Andre is now solid as reported by various
> people on IRC, here is my local patch (stable for me) which you can apply
> to play with the shiny new IDE core (IDE core fix is same as
> ata-253p1-2.bz2 from Jens). (-:

I would like to say the same. I worked with the fixed kernel
2.5.3-pre1+ata-253p1-2 yesterday without problems. I rebooted the machine
before leaving the office last night, and this morning the screen was
full of:

hda: lost interrupt
hda: lost interrupt
hda: lost interrupt
hda: lost interrupt
hda: lost interrupt

I have to say that something like :

All work and no play makes Jack a dull boy ...
All work and no play makes Jack a dull boy ...
All work and no play makes Jack a dull boy ...

would have scared me more, but I still think there's some tuning to play
with ...




- Davide


2002-01-18 19:06:05

by Jens Axboe

Subject: Re: Linux 2.5.3-pre1-aia1

On Fri, Jan 18 2002, Davide Libenzi wrote:
> On Fri, 18 Jan 2002, Anton Altaparmakov wrote:
>
> > Since the new IDE core from Andre is now solid as reported by various
> > people on IRC, here is my local patch (stable for me) which you can apply
> > to play with the shiny new IDE core (IDE core fix is same as
> > ata-253p1-2.bz2 from Jens). (-:
>
> I would like to say the same. I worked with the fixed kernel
> 2.5.3-pre1+ata-253p1-2 yesterday w/out problems. I rebootedt the machine
> before leaving the office yesterday night and this morning it had a full
> screen :
>
> hda: lost interrupt
> hda: lost interrupt
> hda: lost interrupt
> hda: lost interrupt
> hda: lost interrupt

What mode? PIO and no multi mode, or?

--
Jens Axboe

2002-01-18 19:18:37

by Davide Libenzi

Subject: Re: Linux 2.5.3-pre1-aia1

On Fri, 18 Jan 2002, Jens Axboe wrote:

> On Fri, Jan 18 2002, Davide Libenzi wrote:
> > On Fri, 18 Jan 2002, Anton Altaparmakov wrote:
> >
> > > Since the new IDE core from Andre is now solid as reported by various
> > > people on IRC, here is my local patch (stable for me) which you can apply
> > > to play with the shiny new IDE core (IDE core fix is same as
> > > ata-253p1-2.bz2 from Jens). (-:
> >
> > I would like to say the same. I worked with the fixed kernel
> > 2.5.3-pre1+ata-253p1-2 yesterday w/out problems. I rebootedt the machine
> > before leaving the office yesterday night and this morning it had a full
> > screen :
> >
> > hda: lost interrupt
> > hda: lost interrupt
> > hda: lost interrupt
> > hda: lost interrupt
> > hda: lost interrupt
>
> What mode? PIO and no multi mode, or?


This is what 2.5.2 reports:


[root@blue1 davide]# cat /proc/ide/hda/settings
name                value       min   max    mode
----                -----       ---   ---    ----
bios_cyl            2495        0     65535  rw
bios_head           255         0     255    rw
bios_sect           63          0     63     rw
breada_readahead    4           0     127    rw
bswap               0           0     1      r
current_speed       0           0     69     rw
failures            0           0     65535  rw
file_readahead      124         0     16384  rw
ide_scsi            0           0     1      rw
init_speed          0           0     69     rw
io_32bit            0           0     3      rw
keepsettings        0           0     1      rw
lun                 0           0     7      rw
max_failures        1           0     65535  rw
multcount           8           0     8      rw
nice1               1           0     1      rw
nowerr              0           0     1      rw
number              0           0     3      rw
pio_mode            write-only  0     255    w
slow                0           0     1      rw
unmaskirq           0           0     1      rw
using_dma           0           0     1      rw
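
For readers who want to answer the "what mode?" question from output like the above, here is a minimal sketch. It assumes the five-column "name value min max mode" layout shown in this message, that using_dma 0 means PIO, and that a non-zero multcount means multi mode is on. The function names are made up for illustration and are not part of any kernel or hdparm interface.

```python
# Illustrative only: parse "name value min max mode" rows as printed by
# /proc/ide/hda/settings in the message above. Sample rows are copied from it.
SETTINGS = """\
name value min max mode
---- ----- --- --- ----
multcount 8 0 8 rw
pio_mode write-only 0 255 w
using_dma 0 0 1 rw
"""


def parse_settings(text):
    """Return {name: value}, converting numeric values to int."""
    out = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and the dashed separator row.
        if len(fields) != 5 or fields[0] in ("name", "----"):
            continue
        name, value = fields[0], fields[1]
        out[name] = int(value) if value.isdigit() else value
    return out


def describe_mode(settings):
    """Summarise transfer mode; assumes using_dma == 0 means PIO."""
    transfer = "DMA" if settings.get("using_dma") == 1 else "PIO"
    mult = settings.get("multcount", 0)
    multi = f"multi mode on (multcount={mult})" if mult else "multi mode off"
    return f"{transfer}, {multi}"


print(describe_mode(parse_settings(SETTINGS)))
# prints: PIO, multi mode on (multcount=8)
```

The multcount figure here is the raw value as printed in the settings file.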





Linux version 2.5.2-mqo1 ([email protected]) (gcc version 3.0.3) #12 Wed Jan 16 09:49:54 PST 2002
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 0000000010000000 (usable)
BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
On node 0 totalpages: 65536
zone(0): 4096 pages.
zone(1): 61440 pages.
zone(2): 0 pages.
Kernel command line: auto BOOT_IMAGE=2.5.2-mqo1 ro root=305 BOOT_FILE=/boot/vmlinuz-2.5.2-mqo1
Initializing CPU#0
Detected 999.554 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 1992.29 BogoMIPS
Memory: 255896k/262144k available (1229k kernel code, 5860k reserved, 341k data, 204k init, 0k highmem)
Dentry-cache hash table entries: 32768 (order: 6, 262144 bytes)
Inode-cache hash table entries: 16384 (order: 5, 131072 bytes)
Mount-cache hash table entries: 4096 (order: 3, 32768 bytes)
Buffer-cache hash table entries: 16384 (order: 4, 65536 bytes)
Page-cache hash table entries: 65536 (order: 6, 262144 bytes)
CPU: Before vendor init, caps: 0183f9ff c1c7f9ff 00000000, vendor = 2
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
CPU: After vendor init, caps: 0183f9ff c1c7f9ff 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: After generic, caps: 0183f9ff c1c7f9ff 00000000 00000000
CPU: Common caps: 0183f9ff c1c7f9ff 00000000 00000000
CPU: AMD Athlon(tm) Processor stepping 02
Enabling fast FPU save and restore... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.40 (20010327) Richard Gooch ([email protected])
mtrr: detected mtrr type: Intel
PCI: PCI BIOS revision 2.10 entry at 0xfb350, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Using IRQ router VIA [1106/0686] at 00:07.0
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Starting kswapd
BIO: pool of 256 setup, 14Kb (56 bytes/bio)
biovec: init pool 0, 1 entries, 12 bytes
biovec: init pool 1, 4 entries, 48 bytes
biovec: init pool 2, 16 entries, 192 bytes
biovec: init pool 3, 64 entries, 768 bytes
biovec: init pool 4, 128 entries, 1536 bytes
biovec: init pool 5, 256 entries, 3072 bytes
pty: 256 Unix98 ptys configured
Serial driver version 5.05c (2001-07-08) with MANY_PORTS SHARE_IRQ SERIAL_PCI enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
block: 256 slots per queue, batch=32
Uniform Multi-Platform E-IDE driver Revision: 6.32
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller on PCI slot 00:07.1
VP_IDE: chipset revision 16
VP_IDE: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xd000-0xd007, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xd008-0xd00f, BIOS settings: hdc:pio, hdd:pio
hda: WDC WD205BA, ATA DISK drive
hdb: CD-ROM 50X L, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: 40088160 sectors (20525 MB) w/2048KiB Cache, CHS=2495/255/63
hdb: ATAPI 50X CD-ROM drive, 128kB Cache
Uniform CD-ROM driver Revision: 3.12
Partition check:
hda: hda1 hda2 < hda5 hda6 hda7 >
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
eepro100.c:v1.09j-t 9/29/99 Donald Becker http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <[email protected]> and others
PCI: Found IRQ 11 for device 00:14.0
PCI: Sharing IRQ 11 with 00:07.2
PCI: Sharing IRQ 11 with 00:07.3
eth0: Intel Corp. 82557 [Ethernet Pro 100], 00:02:B3:11:E5:92, IRQ 11.
Board assembly 721383-016, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x04f4518b).
Linux agpgart interface v0.99 (c) Jeff Hartmann
agpgart: Maximum main memory to use for agp memory: 204M
agpgart: Detected Via Apollo Pro KT133 chipset
agpgart: AGP aperture is 32M @ 0xd4000000
Linux Kernel Card Services 3.1.22
options: [pci] [cardbus] [pm]
usb.c: registered new driver hub
uhci.c: USB Universal Host Controller Interface driver v1.1
PCI: Found IRQ 11 for device 00:07.2
PCI: Sharing IRQ 11 with 00:07.3
PCI: Sharing IRQ 11 with 00:14.0
uhci.c: USB UHCI at I/O 0xd400, IRQ 11
usb.c: new USB bus registered, assigned bus number 1
hub.c: USB hub found at /
hub.c: 2 ports detected
PCI: Found IRQ 11 for device 00:07.3
PCI: Sharing IRQ 11 with 00:07.2
PCI: Sharing IRQ 11 with 00:14.0
uhci.c: USB UHCI at I/O 0xd800, IRQ 11
usb.c: new USB bus registered, assigned bus number 2
hub.c: USB hub found at /
hub.c: 2 ports detected
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 2048 buckets, 16Kbytes
TCP: Hash tables configured (established 16384 bind 32768)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
ds: no socket drivers loaded!
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 204k freed
Adding Swap: 530104k swap-space (priority -1)
NFS: NFSv3 not supported.
nfs warning: mount version older than kernel




- Davide



2002-01-18 19:42:15

by Davide Libenzi

Subject: Re: Linux 2.5.3-pre1-aia1

On Fri, 18 Jan 2002, Andre Hedrick wrote:

> On Fri, 18 Jan 2002, Davide Libenzi wrote:
>
> > On Fri, 18 Jan 2002, Jens Axboe wrote:
> >
> > > On Fri, Jan 18 2002, Davide Libenzi wrote:
> > > > On Fri, 18 Jan 2002, Anton Altaparmakov wrote:
> > > >
> > > > > Since the new IDE core from Andre is now solid as reported by various
> > > > > people on IRC, here is my local patch (stable for me) which you can apply
> > > > > to play with the shiny new IDE core (IDE core fix is same as
> > > > > ata-253p1-2.bz2 from Jens). (-:
> > > >
> > > > I would like to say the same. I worked with the fixed kernel
> > > > 2.5.3-pre1+ata-253p1-2 yesterday w/out problems. I rebootedt the machine
> > > > before leaving the office yesterday night and this morning it had a full
> > > > screen :
> > > >
> > > > hda: lost interrupt
> > > > hda: lost interrupt
> > > > hda: lost interrupt
> > > > hda: lost interrupt
> > > > hda: lost interrupt
> > >
> > > What mode? PIO and no multi mode, or?
> >
> >
> > This is what reports me 2.5.2 :
> >
> >
> > [root@blue1 davide]# cat /proc/ide/hda/settings
> > name value min max mode
> > ---- ----- --- --- ----
> > bios_cyl 2495 0 65535 rw
> > bios_head 255 0 255 rw
> > bios_sect 63 0 63 rw
> > breada_readahead 4 0 127 rw
> > bswap 0 0 1 r
> > current_speed 0 0 69 rw
> > failures 0 0 65535 rw
> > file_readahead 124 0 16384 rw
> > ide_scsi 0 0 1 rw
> > init_speed 0 0 69 rw
> > io_32bit 0 0 3 rw
> > keepsettings 0 0 1 rw
> > lun 0 0 7 rw
> > max_failures 1 0 65535 rw
> > multcount 8 0 8 rw
>
> There is a / 2 factor here, thus reality is 16,0,16

Guys, instead of requiring -m8 from every user who observes this
problem, isn't it better to limit it inside the driver until things
get fixed?




- Davide


2002-01-21 05:49:54

by Andre Hedrick

Subject: Re: Linux 2.5.3-pre1-aia1



Since we are limited to 4k pages, i.e. 8-sector transfers, in multimode
for now, please set hdparm -m8 /dev/hdX.
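
The arithmetic behind the 8 can be sketched as follows, assuming 512-byte sectors and a 4096-byte page. Both constants and the helper names are assumptions made for illustration; the helper only builds the command string and does not run hdparm.

```python
SECTOR_BYTES = 512   # assumed ATA sector size
PAGE_BYTES = 4096    # assumed page size ("4k pages" above)


def sectors_per_page(page_bytes=PAGE_BYTES, sector_bytes=SECTOR_BYTES):
    """Number of whole sectors that fit in one page."""
    return page_bytes // sector_bytes


def hdparm_multcount_command(device, page_bytes=PAGE_BYTES):
    """Build (but do not execute) the hdparm command for that count."""
    return f"hdparm -m{sectors_per_page(page_bytes)} {device}"


print(sectors_per_page())                    # prints: 8
print(hdparm_multcount_command("/dev/hda"))  # prints: hdparm -m8 /dev/hda
```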

http://www.kernel.org/pub/linux/kernel/people/hedrick/acb-io-2.5.3/ata-253p1-2+axboe1+fixes.patch.bz2

This should be a valid patch and it includes some extras for Jens.

On Fri, 18 Jan 2002, Davide Libenzi wrote:

> On Fri, 18 Jan 2002, Anton Altaparmakov wrote:
>
> > Since the new IDE core from Andre is now solid as reported by various
> > people on IRC, here is my local patch (stable for me) which you can apply
> > to play with the shiny new IDE core (IDE core fix is same as
> > ata-253p1-2.bz2 from Jens). (-:
>
> I would like to say the same. I worked with the fixed kernel
> 2.5.3-pre1+ata-253p1-2 yesterday w/out problems. I rebootedt the machine
> before leaving the office yesterday night and this morning it had a full
> screen :
>
> hda: lost interrupt
> hda: lost interrupt
> hda: lost interrupt
> hda: lost interrupt
> hda: lost interrupt
>
> I have to say that something like :
>
> All work and no play makes Jack a dull boy ...
> All work and no play makes Jack a dull boy ...
> All work and no play makes Jack a dull boy ...
>
> would have scared me more, but still i think there's some tuning to play
> with ...
>
>
>
>
> - Davide
>
>

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-21 05:49:54

by Andre Hedrick

Subject: Re: Linux 2.5.3-pre1-aia1

On Fri, 18 Jan 2002, Davide Libenzi wrote:

> > > multcount 8 0 8 rw
> >
> > There is a / 2 factor here, thus reality is 16,0,16
>
> Guys, instead of requiring an -m8 to every user that is observing this
> problem, isn't it better that you limit it inside the driver until things
> gets fixed ?

Yes, and it is more fun than it sounds. The original driver always checked
for max capabilities and set the value accordingly, but things have changed.

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-21 05:49:52

by Andre Hedrick

Subject: Re: Linux 2.5.3-pre1-aia1

On Fri, 18 Jan 2002, Davide Libenzi wrote:

> On Fri, 18 Jan 2002, Andre Hedrick wrote:
> > > multcount 8 0 8 rw
> >
> > There is a / 2 factor here, thus reality is 16,0,16
>
> Guys, instead of requiring an -m8 to every user that is observing this
> problem, isn't it better that you limit it inside the driver until things
> gets fixed ?

Better yet is to comment out (#) the CONFIG_IDEDISK_MULTI_MODE option in
Config.in for now.

Andre Hedrick
Linux Disk Certification Project Linux ATA Development


2002-01-21 05:49:49

by Andre Hedrick

Subject: Re: Linux 2.5.3-pre1-aia1

On Fri, 18 Jan 2002, Davide Libenzi wrote:

> On Fri, 18 Jan 2002, Jens Axboe wrote:
>
> > On Fri, Jan 18 2002, Davide Libenzi wrote:
> > > On Fri, 18 Jan 2002, Anton Altaparmakov wrote:
> > >
> > > > Since the new IDE core from Andre is now solid as reported by various
> > > > people on IRC, here is my local patch (stable for me) which you can apply
> > > > to play with the shiny new IDE core (IDE core fix is same as
> > > > ata-253p1-2.bz2 from Jens). (-:
> > >
> > > I would like to say the same. I worked with the fixed kernel
> > > 2.5.3-pre1+ata-253p1-2 yesterday w/out problems. I rebootedt the machine
> > > before leaving the office yesterday night and this morning it had a full
> > > screen :
> > >
> > > hda: lost interrupt
> > > hda: lost interrupt
> > > hda: lost interrupt
> > > hda: lost interrupt
> > > hda: lost interrupt
> >
> > What mode? PIO and no multi mode, or?
>
>
> This is what reports me 2.5.2 :
>
>
> [root@blue1 davide]# cat /proc/ide/hda/settings
> name value min max mode
> ---- ----- --- --- ----
> bios_cyl 2495 0 65535 rw
> bios_head 255 0 255 rw
> bios_sect 63 0 63 rw
> breada_readahead 4 0 127 rw
> bswap 0 0 1 r
> current_speed 0 0 69 rw
> failures 0 0 65535 rw
> file_readahead 124 0 16384 rw
> ide_scsi 0 0 1 rw
> init_speed 0 0 69 rw
> io_32bit 0 0 3 rw
> keepsettings 0 0 1 rw
> lun 0 0 7 rw
> max_failures 1 0 65535 rw
> multcount 8 0 8 rw

There is a factor of 2 here, so the real values are 16, 0, 16.

> nice1 1 0 1 rw
> nowerr 0 0 1 rw
> number 0 0 3 rw
> pio_mode write-only 0 255 w
> slow 0 0 1 rw
> unmaskirq 0 0 1 rw
> using_dma 0 0 1 rw
>
>
>
>
>
> Linux version 2.5.2-mqo1 ([email protected]) (gcc version 3.0.3) #12 Wed Jan 16 09:49:54 PST 2002
> BIOS-provided physical RAM map:
> BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
> BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
> BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
> BIOS-e820: 0000000000100000 - 0000000010000000 (usable)
> BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
> On node 0 totalpages: 65536
> zone(0): 4096 pages.
> zone(1): 61440 pages.
> zone(2): 0 pages.
> Kernel command line: auto BOOT_IMAGE=2.5.2-mqo1 ro root=305 BOOT_FILE=/boot/vmlinuz-2.5.2-mqo1
> Initializing CPU#0
> Detected 999.554 MHz processor.
> Console: colour VGA+ 80x25
> Calibrating delay loop... 1992.29 BogoMIPS
> Memory: 255896k/262144k available (1229k kernel code, 5860k reserved, 341k data, 204k init, 0k highmem)
> Dentry-cache hash table entries: 32768 (order: 6, 262144 bytes)
> Inode-cache hash table entries: 16384 (order: 5, 131072 bytes)
> Mount-cache hash table entries: 4096 (order: 3, 32768 bytes)
> Buffer-cache hash table entries: 16384 (order: 4, 65536 bytes)
> Page-cache hash table entries: 65536 (order: 6, 262144 bytes)
> CPU: Before vendor init, caps: 0183f9ff c1c7f9ff 00000000, vendor = 2
> CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
> CPU: L2 Cache: 256K (64 bytes/line)
> CPU: After vendor init, caps: 0183f9ff c1c7f9ff 00000000 00000000
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#0.
> CPU: After generic, caps: 0183f9ff c1c7f9ff 00000000 00000000
> CPU: Common caps: 0183f9ff c1c7f9ff 00000000 00000000
> CPU: AMD Athlon(tm) Processor stepping 02
> Enabling fast FPU save and restore... done.
> Checking 'hlt' instruction... OK.
> POSIX conformance testing by UNIFIX
> mtrr: v1.40 (20010327) Richard Gooch ([email protected])
> mtrr: detected mtrr type: Intel
> PCI: PCI BIOS revision 2.10 entry at 0xfb350, last bus=1
> PCI: Using configuration type 1
> PCI: Probing PCI hardware
> PCI: Using IRQ router VIA [1106/0686] at 00:07.0
> isapnp: Scanning for PnP cards...
> isapnp: No Plug & Play device found
> Linux NET4.0 for Linux 2.4
> Based upon Swansea University Computer Society NET3.039
> Starting kswapd
> BIO: pool of 256 setup, 14Kb (56 bytes/bio)
> biovec: init pool 0, 1 entries, 12 bytes
> biovec: init pool 1, 4 entries, 48 bytes
> biovec: init pool 2, 16 entries, 192 bytes
> biovec: init pool 3, 64 entries, 768 bytes
> biovec: init pool 4, 128 entries, 1536 bytes
> biovec: init pool 5, 256 entries, 3072 bytes
> pty: 256 Unix98 ptys configured
> Serial driver version 5.05c (2001-07-08) with MANY_PORTS SHARE_IRQ SERIAL_PCI enabled
> ttyS00 at 0x03f8 (irq = 4) is a 16550A
> ttyS01 at 0x02f8 (irq = 3) is a 16550A
> block: 256 slots per queue, batch=32
> Uniform Multi-Platform E-IDE driver Revision: 6.32
> ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
> VP_IDE: IDE controller on PCI slot 00:07.1
> VP_IDE: chipset revision 16
> VP_IDE: not 100% native mode: will probe irqs later
> ide0: BM-DMA at 0xd000-0xd007, BIOS settings: hda:DMA, hdb:DMA
> ide1: BM-DMA at 0xd008-0xd00f, BIOS settings: hdc:pio, hdd:pio
> hda: WDC WD205BA, ATA DISK drive
> hdb: CD-ROM 50X L, ATAPI CD/DVD-ROM drive
> ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> hda: 40088160 sectors (20525 MB) w/2048KiB Cache, CHS=2495/255/63
> hdb: ATAPI 50X CD-ROM drive, 128kB Cache
> Uniform CD-ROM driver Revision: 3.12
> Partition check:
> hda: hda1 hda2 < hda5 hda6 hda7 >
> Floppy drive(s): fd0 is 1.44M
> FDC 0 is a post-1991 82077
> eepro100.c:v1.09j-t 9/29/99 Donald Becker http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
> eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <[email protected]> and others
> PCI: Found IRQ 11 for device 00:14.0
> PCI: Sharing IRQ 11 with 00:07.2
> PCI: Sharing IRQ 11 with 00:07.3
> eth0: Intel Corp. 82557 [Ethernet Pro 100], 00:02:B3:11:E5:92, IRQ 11.
> Board assembly 721383-016, Physical connectors present: RJ45
> Primary interface chip i82555 PHY #1.
> General self-test: passed.
> Serial sub-system self-test: passed.
> Internal registers self-test: passed.
> ROM checksum self-test: passed (0x04f4518b).
> Linux agpgart interface v0.99 (c) Jeff Hartmann
> agpgart: Maximum main memory to use for agp memory: 204M
> agpgart: Detected Via Apollo Pro KT133 chipset
> agpgart: AGP aperture is 32M @ 0xd4000000
> Linux Kernel Card Services 3.1.22
> options: [pci] [cardbus] [pm]
> usb.c: registered new driver hub
> uhci.c: USB Universal Host Controller Interface driver v1.1
> PCI: Found IRQ 11 for device 00:07.2
> PCI: Sharing IRQ 11 with 00:07.3
> PCI: Sharing IRQ 11 with 00:14.0
> uhci.c: USB UHCI at I/O 0xd400, IRQ 11
> usb.c: new USB bus registered, assigned bus number 1
> hub.c: USB hub found at /
> hub.c: 2 ports detected
> PCI: Found IRQ 11 for device 00:07.3
> PCI: Sharing IRQ 11 with 00:07.2
> PCI: Sharing IRQ 11 with 00:14.0
> uhci.c: USB UHCI at I/O 0xd800, IRQ 11
> usb.c: new USB bus registered, assigned bus number 2
> hub.c: USB hub found at /
> hub.c: 2 ports detected
> NET4: Linux TCP/IP 1.0 for NET4.0
> IP Protocols: ICMP, UDP, TCP, IGMP
> IP: routing cache hash table of 2048 buckets, 16Kbytes
> TCP: Hash tables configured (established 16384 bind 32768)
> NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
> ds: no socket drivers loaded!
> VFS: Mounted root (ext2 filesystem) readonly.
> Freeing unused kernel memory: 204k freed
> Adding Swap: 530104k swap-space (priority -1)
> NFS: NFSv3 not supported.
> nfs warning: mount version older than kernel
>
>
>
>
> - Davide
>
>
>

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-19 11:41:00

by Jens Axboe

Subject: Re: Linux 2.5.3-pre1-aia1

On Fri, Jan 18 2002, Davide Libenzi wrote:
> Guys, instead of requiring an -m8 to every user that is observing this
> problem, isn't it better that you limit it inside the driver until things
> gets fixed ?

There is no -m8 limit; 2.5.3-pre1 plus the ata-253p1-2 patch handles any
multi mode value that is set.

--
Jens Axboe

2002-01-19 15:45:54

by Jens Axboe

Subject: Re: Linux 2.5.3-pre1-aia1

On Sat, Jan 19 2002, Andre Hedrick wrote:
> On Sat, 19 Jan 2002, Jens Axboe wrote:
>
> > On Fri, Jan 18 2002, Davide Libenzi wrote:
> > > Guys, instead of requiring an -m8 to every user that is observing this
> > > problem, isn't it better that you limit it inside the driver until things
> > > gets fixed ?
> >
> > There is no -m8 limit, 2.5.3-pre1 + ata253p1-2 patch handles any set
> > multi mode value.
> >
> > --
> > Jens Axboe
> >
>
> And that will generate the [lost interrupt], and I have it fixed at all
> levels too now.

How so? I don't see the problem.

--
Jens Axboe

2002-01-19 21:39:47

by Davide Libenzi

Subject: Re: Linux 2.5.3-pre1-aia1

On Sat, 19 Jan 2002, Andre Hedrick wrote:

> On Sat, 19 Jan 2002, Jens Axboe wrote:
>
> > On Sat, Jan 19 2002, Andre Hedrick wrote:
> > > On Sat, 19 Jan 2002, Jens Axboe wrote:
> > >
> > > > On Fri, Jan 18 2002, Davide Libenzi wrote:
> > > > > Guys, instead of requiring an -m8 to every user that is observing this
> > > > > problem, isn't it better that you limit it inside the driver until things
> > > > > gets fixed ?
> > > >
> > > > There is no -m8 limit, 2.5.3-pre1 + ata253p1-2 patch handles any set
> > > > multi mode value.
> > > >
> > > > --
> > > > Jens Axboe
> > > >
> > >
> > > And that will generate the [lost interrupt], and I have it fixed at all
> > > levels too now.
> >
> > How so? I don't see the problem.
>
> Unlike ATAPI, which will generally send you more data than requested on
> its own, ATA devices do not enjoy or play that game. Additionally, the
> current code asks for 16 sectors, but we do not do the request copy
> anymore, and this means that for every 4k of paging we are soliciting 8k.
> We only read out 4k, thus the device has the next 4k we may be wanting
> ready. Look at it as a dirty prefetch, but eventually the drive is going
> to want to go south, thus [lost interrupt]
>
> Basically as the Block maintainer, you pointed out I am restricted to 4k
> chunking in PIO. You decided, in the interest of the block glue layer
> into the driver, to force early end request per Linus's requirements to
> return back every 4k completed to block regardless of the size of the
> total data requested.
>
> For the above two conditions to be properly satisfied, I have to adjust
> and apply one driver policy to make the driver behave and give the desired
> results. We should note this will conform with future IDEMA proposals
> being submitted to the T committees.

That was it. By limiting the sector count request I was able to fix it.
Do you have any permanent, non-hackish fix for this?




- Davide


2002-01-20 01:57:30

by Davide Libenzi

Subject: Re: Linux 2.5.3-pre1-aia1

On Sat, 19 Jan 2002, Andre Hedrick wrote:

>
> Yes,
>
> I have a patch against 2.5.3-pre1 clean, and is up on kernel.org's upload
> site, but k.o is down. It can also be gotten off
> http://www.linuxdiskcert.org/
> It is a tiny 37k patch and bzip2'd to 8k.

With the patch posted by Anton (patch-2.5.3-pre1-aia2) applied, the
problem persists. The machine seems usable, but from time to time the timer
fires and a lost interrupt shows up. I'm going to try your patch now.




- Davide


2002-01-20 10:49:22

by Jens Axboe

Subject: Re: Linux 2.5.3-pre1-aia1

On Sat, Jan 19 2002, Andre Hedrick wrote:
> > On Sat, Jan 19 2002, Andre Hedrick wrote:
> > > On Sat, 19 Jan 2002, Jens Axboe wrote:
> > >
> > > > On Fri, Jan 18 2002, Davide Libenzi wrote:
> > > > > Guys, instead of requiring an -m8 to every user that is observing this
> > > > > problem, isn't it better that you limit it inside the driver until things
> > > > > gets fixed ?
> > > >
> > > > There is no -m8 limit, 2.5.3-pre1 + ata253p1-2 patch handles any set
> > > > multi mode value.
> > > >
> > > > --
> > > > Jens Axboe
> > > >
> > >
> > > And that will generate the [lost interrupt], and I have it fixed at all
> > > levels too now.
> >
> > How so? I don't see the problem.
>
> Unlike ATAPI, which will generally send you more data than requested on
> its own, ATA devices do not enjoy or play that game. Additionally, the

Unrelated ATAPI fodder :-)

> current code asks for 16 sectors, but we do not do the request copy
> anymore, and this means that for every 4k of paging we are soliciting 8k.

The (now) missing copy is unrelated.

> We only read out 4k, thus the device has the next 4k we may be wanting
> ready. Look at it as a dirty prefetch, but eventually the drive is going
> to want to go south, thus [lost interrupt]

Even if the drive is programmed for 16 sectors in multi mode, it still
must honor lower transfer sizes. The fix I did was not to limit this,
but rather to only set up transfers for the number of sectors in the
first chunk. This is indeed necessary now that we do not have a copy of
the request to fool around with.
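
One way to read that approach is sketched below. This is not the actual patch, only an illustration under stated assumptions: a request is modelled as a list of chunk sizes in sectors, each command is programmed for exactly one chunk, and in multi mode the drive moves data in blocks of up to multcount sectors with one interrupt per block, the last block possibly being partial. All names are hypothetical.

```python
import math


def interrupts_for_command(nsect, multcount):
    """Data interrupts expected for one multi-mode command.

    Assumes one interrupt per block of up to `multcount` sectors,
    with a final partial block when nsect is not a multiple of it.
    """
    return math.ceil(nsect / multcount)


def first_chunk_plan(chunks, multcount=16):
    """Program one command per chunk instead of one for the whole request.

    Returns a list of (sectors_programmed, interrupts_expected) tuples,
    in the order the chunks would be handed back to the block layer.
    """
    return [(nsect, interrupts_for_command(nsect, multcount))
            for nsect in chunks]


# Three 4k chunks (8 sectors each) on a drive set to 16-sector multi mode.
print(first_chunk_plan([8, 8, 8]))
# prints: [(8, 1), (8, 1), (8, 1)]
```

In this model the sector count given to the drive never exceeds what the host will read for that chunk, regardless of the multcount setting.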

> Basically as the Block maintainer, you pointed out I am restricted to 4k
> chunking in PIO. You decided, in the interest of the block glue layer
> into the driver, to force early end request per Linus's requirements to
> return back every 4k completed to block regardless of the size of the
> total data requested.

Correct. The solution I did (which was one of the two I suggested) is
still the cleanest, IMHO.

> For the above two conditions to be properly satisfied, I have to adjust
> and apply one driver policy to make the driver behave and give the desired
> results. We should note this will conform with future IDEMA proposals
> being submitted to the T committees.

I still don't see a description of why this would cause a lost
interrupt. What is the flaw in my theory and/or code?

--
Jens Axboe

2002-01-20 18:49:56

by Davide Libenzi

Subject: Re: Linux 2.5.3-pre1-aia1

On Sun, 20 Jan 2002, Jens Axboe wrote:

> On Sat, Jan 19 2002, Andre Hedrick wrote:
> > > On Sat, Jan 19 2002, Andre Hedrick wrote:
> > > > On Sat, 19 Jan 2002, Jens Axboe wrote:
> > > >
> > > > > On Fri, Jan 18 2002, Davide Libenzi wrote:
> > > > > > Guys, instead of requiring an -m8 to every user that is observing this
> > > > > > problem, isn't it better that you limit it inside the driver until things
> > > > > > gets fixed ?
> > > > >
> > > > > There is no -m8 limit, 2.5.3-pre1 + ata253p1-2 patch handles any set
> > > > > multi mode value.
> > > > >
> > > > > --
> > > > > Jens Axboe
> > > > >
> > > >
> > > > And that will generate the [lost interrupt], and I have it fixed at all
> > > > levels too now.
> > >
> > > How so? I don't see the problem.
> >
> > Unlike ATAPI which will generally send you more data than requested on
> > itw own, ATA devices do not like enjoy or play the game. Additionally the
>
> Unrelated ATAPI fodder :-)
>
> > current code asks for 16 sectors, but we do not do the request copy
> > anymore, and this mean for every 4k of paging we are soliciting for 8k.
>
> The (now) missing copy is unrelated.
>
> > We only read out 4k thus the device has the the next 4k we may be wanting
> > ready. Look at it as a dirty prefetch, but eventally the drive is going
> > to want to go south, thus [lost interrupt]
>
> Even if the drive is programmed for 16 sectors in multi mode, it still
> must honor lower transfer sizes. The fix I did was not to limit this,
> but rather to only setup transfers for the amount of sectors in the
> first chunk. This is indeed necessary now that we do not have a copy of
> the request to fool around with.
>
> > Basically as the Block maintainer, you pointed out I am restricted to 4k
> > chunking in PIO. You decided, in the interest of the block glue layer
> > into the driver, to force early end request per Linus's requirements to
> > return back every 4k completed to block regardless of the size of the
> > total data requested.
>
> Correct. The solution I did (which was one of the two I suggested) is
> still the cleanest, IMHO.
>
> > For the above two condition to be properly satisfied, I have to adjust
> > and apply one driver policy make the driver behave and give the desired
> > results. We should note this will conform with future IDEMA proposals
> > being submitted to the T committees.
>
> I still don't see a description of why this would cause a lost
> interrupt. What is the flaw in my theory and/or code?

Guys, I'm sorry to report bad news, but I still get 'lost interrupt'
with all patches applied (Anton's and Andre's).




- Davide


2002-01-21 05:45:48

by Andre Hedrick

Subject: Re: Linux 2.5.3-pre1-aia1

vger eats another one :-(
---------- Forwarded message ----------
Date: Sun, 20 Jan 2002 16:12:36 -0800 (PST)
From: Andre Hedrick <[email protected]>
To: Davide Libenzi <[email protected]>
Cc: Jens Axboe <[email protected]>, Anton Altaparmakov <[email protected]>,
Linus Torvalds <[email protected]>,
lkml <[email protected]>
Subject: Re: Linux 2.5.3-pre1-aia1

On Sun, 20 Jan 2002, Davide Libenzi wrote:

> On Sun, 20 Jan 2002, Jens Axboe wrote:
>
> > On Sat, Jan 19 2002, Andre Hedrick wrote:
> > > > On Sat, Jan 19 2002, Andre Hedrick wrote:
> > > > > On Sat, 19 Jan 2002, Jens Axboe wrote:
> > > > >
> > > > > > On Fri, Jan 18 2002, Davide Libenzi wrote:
> > > > > > > Guys, instead of requiring an -m8 to every user that is observing this
> > > > > > > problem, isn't it better that you limit it inside the driver until things
> > > > > > > gets fixed ?
> > > > > >
> > > > > > There is no -m8 limit, 2.5.3-pre1 + ata253p1-2 patch handles any set
> > > > > > multi mode value.
> > > > > >
> > > > > > --
> > > > > > Jens Axboe
> > > > > >
> > > > >
> > > > > And that will generate the [lost interrupt], and I have it fixed at all
> > > > > levels too now.
> > > >
> > > > How so? I don't see the problem.
> > >
> > > Unlike ATAPI which will generally send you more data than requested on
> > > itw own, ATA devices do not like enjoy or play the game. Additionally the
> >
> > Unrelated ATAPI fodder :-)
> >
> > > current code asks for 16 sectors, but we do not do the request copy
> > > anymore, and this mean for every 4k of paging we are soliciting for 8k.
> >
> > The (now) missing copy is unrelated.
> >
> > > We only read out 4k thus the device has the the next 4k we may be wanting
> > > ready. Look at it as a dirty prefetch, but eventally the drive is going
> > > to want to go south, thus [lost interrupt]
> >
> > Even if the drive is programmed for 16 sectors in multi mode, it still
> > must honor lower transfer sizes. The fix I did was not to limit this,
> > but rather to only setup transfers for the amount of sectors in the
> > first chunk. This is indeed necessary now that we do not have a copy of
> > the request to fool around with.

Listen for just a second, okay.

Since the set multimode command is similar to the set transfer rate, if
you program the drive to run at U100 but the host can feed only U33 you
have problems. Much of this simple argument is the same answer for
multimode.

Same thing here, but a variation of the operations:

> > > Basically as the Block maintainer, you pointed out I am restricted to 4k
> > > chunking in PIO. You decided, in the interest of the block glue layer
> > > into the driver, to force early end request per Linus's requirements to
> > > return back every 4k completed to block regardless of the size of the
> > > total data requested.
> >
> > Correct. The solution I did (which was one of the two I suggested) is
> > still the cleanest, IMHO.

The "cleanest" != technical correctness. It may be the cleanest for the
current infrastructure of BLOCK, but by no means is it technically
correct. It is more the Darwinism of hammering a square object into a
round hole.

> > > For the above two condition to be properly satisfied, I have to adjust
> > > and apply one driver policy make the driver behave and give the desired
> > > results. We should note this will conform with future IDEMA proposals
> > > being submitted to the T committees.
> >
> > I still don't see a description of why this would cause a lost
> > interrupt. What is the flaw in my theory and/or code?

Because you think of the OS as defining the guidelines, and the reality is
that the hardware defines the rules and the OS has to work around them. I am
happy to allow you to continue to modify the ISR behavior and the command
behavior, and I am willing to be proven wrong. When the problem does not
go away, I will request we return to the rules of the hardware. And this
may require a MEMPOOL just like SCSI.

> Guys, i'm sorry to report you bad news but i still get 'lost interrupt'
> with all applied patches ( Anton and Andre ).

Regards,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development



2002-01-21 05:45:48

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

vger eats a second one :-(
---------- Forwarded message ----------
Date: Sun, 20 Jan 2002 17:48:29 -0800 (PST)
From: Andre Hedrick <[email protected]>
To: Jens Axboe <[email protected]>
Cc: Davide Libenzi <[email protected]>,
Anton Altaparmakov <[email protected]>,
Linus Torvalds <[email protected]>,
lkml <[email protected]>
Subject: Re: Linux 2.5.3-pre1-aia1

On Sun, 20 Jan 2002, Jens Axboe wrote:

> On Sat, Jan 19 2002, Andre Hedrick wrote:
> > > On Sat, Jan 19 2002, Andre Hedrick wrote:
> > > > On Sat, 19 Jan 2002, Jens Axboe wrote:
> > > >
> > > > > On Fri, Jan 18 2002, Davide Libenzi wrote:
> > > > > > Guys, instead of requiring an -m8 to every user that is observing this
> > > > > > problem, isn't it better that you limit it inside the driver until things
> > > > > > gets fixed ?
> > > > >
> > > > > There is no -m8 limit, 2.5.3-pre1 + ata253p1-2 patch handles any set
> > > > > multi mode value.
> > > > >
> > > > > --
> > > > > Jens Axboe
> > > > >
> > > >
> > > > And that will generate the [lost interrupt], and I have it fixed at all
> > > > levels too now.
> > >
> > > How so? I don't see the problem.
> >
> > Unlike ATAPI which will generally send you more data than requested on
> > itw own, ATA devices do not like enjoy or play the game. Additionally the
>
> Unrelated ATAPI fodder :-)
>
> > current code asks for 16 sectors, but we do not do the request copy
> > anymore, and this mean for every 4k of paging we are soliciting for 8k.
>
> The (now) missing copy is unrelated.
>
> > We only read out 4k thus the device has the the next 4k we may be wanting
> > ready. Look at it as a dirty prefetch, but eventally the drive is going
> > to want to go south, thus [lost interrupt]
>
> Even if the drive is programmed for 16 sectors in multi mode, it still
> must honor lower transfer sizes. The fix I did was not to limit this,
> but rather to only setup transfers for the amount of sectors in the
> first chunk. This is indeed necessary now that we do not have a copy of
> the request to fool around with.
>
> > Basically as the Block maintainer, you pointed out I am restricted to 4k
> > chunking in PIO. You decided, in the interest of the block glue layer
> > into the driver, to force early end request per Linus's requirements to
> > return back every 4k completed to block regardless of the size of the
> > total data requested.
>
> Correct. The solution I did (which was one of the two I suggested) is
> still the cleanest, IMHO.
>
> > For the above two condition to be properly satisfied, I have to adjust
> > and apply one driver policy make the driver behave and give the desired
> > results. We should note this will conform with future IDEMA proposals
> > being submitted to the T committees.
>
> I still don't see a description of why this would cause a lost
> interrupt. What is the flaw in my theory and/or code?

We issue a setmultimode command and the driver defaults to the maximum, or 16
sectors in most cases. This means the drive is expecting 16 sectors, and
your design is to issue only 8 sectors or fewer. Issuing 8 sectors
or fewer in the sector_count while the device is expecting 16 is a setup
for problems.

The effective operation your changes have created, without addressing all
the variables, is to terminate the command in progress. Therefore, the
decision made by you was to restrict the transfers to be processed to the
count in rq->current_nr_sectors. There is no bounds checking based on the
command executed.

*****************************
The questions to ask "How would the host terminate a command in progress,
since BSY=1 (or DRQ=1) at this point? Is that done via a DEVICE_RESET or
SRST write?"

Let me quote "Hale Landis" (one of the Fathers of ATA).

On an ATA device a H/W Reset or S/W reset is required to terminate a
command in progress. Writing to the Command register while BSY=1 or
BSY=0:DRQ=1 is illegal.

On an ATAPI device a H/W Reset or S/W reset or DEVICE RESET is
required to terminate a command in progress (and that includes PACKET
commands). Writing a value other than 08H (DEVICE RESET) to the
Command register while BSY=1 or BSY=0:DRQ=1 is illegal.
*****************************
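
The rule quoted above can be written down as a small predicate. This is only
a sketch, not driver code: the function name is hypothetical, and the bit
values (BSY = bit 7, DRQ = bit 3 of the Status register, DEVICE RESET = 08H)
are the usual ATA ones, repeated here as local defines.

```c
#include <stdbool.h>
#include <stdint.h>

#define BUSY_STAT        0x80  /* BSY, Status register bit 7 */
#define DRQ_STAT         0x08  /* DRQ, Status register bit 3 */
#define CMD_DEVICE_RESET 0x08  /* ATAPI DEVICE RESET opcode (08H) */

/*
 * Hypothetical helper: returns true if, under the rule quoted above,
 * the host may write `opcode` to the Command register while the device
 * reports `status`.
 */
static bool command_write_allowed(uint8_t status, bool is_atapi, uint8_t opcode)
{
    /* A command is in progress if BSY=1, or BSY=0 with DRQ=1. */
    bool in_progress = (status & BUSY_STAT) || (status & DRQ_STAT);

    if (!in_progress)
        return true;

    /* Only an ATAPI device accepts a write here, and only DEVICE RESET. */
    return is_atapi && opcode == CMD_DEVICE_RESET;
}
```

For a plain ATA device the predicate is false whenever BSY or DRQ is set,
which leaves only a hardware or software reset to end the command.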

Now what you have created is an illegal operation.

If the device is expecting a fixed amount of data and you stop it in
mid-stream and do not reset the device and issue a second command for
transfer, expect the device to go south.

If we are going to operate in this mode of brokenness, then let me finish
the change to the command structure to make the driver work in that
environment.

Regards,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development



2002-01-21 05:47:29

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Sun, 20 Jan 2002, Davide Libenzi wrote:

> On Sun, 20 Jan 2002, Jens Axboe wrote:
>
> > On Sat, Jan 19 2002, Andre Hedrick wrote:
> > > > On Sat, Jan 19 2002, Andre Hedrick wrote:
> > > > > On Sat, 19 Jan 2002, Jens Axboe wrote:
> > > > >
> > > > > > On Fri, Jan 18 2002, Davide Libenzi wrote:
> > > > > > > Guys, instead of requiring an -m8 to every user that is observing this
> > > > > > > problem, isn't it better that you limit it inside the driver until things
> > > > > > > gets fixed ?
> > > > > >
> > > > > > There is no -m8 limit, 2.5.3-pre1 + ata253p1-2 patch handles any set
> > > > > > multi mode value.
> > > > > >
> > > > > > --
> > > > > > Jens Axboe
> > > > > >
> > > > >
> > > > > And that will generate the [lost interrupt], and I have it fixed at all
> > > > > levels too now.
> > > >
> > > > How so? I don't see the problem.
> > >
> > > Unlike ATAPI which will generally send you more data than requested on
> > > itw own, ATA devices do not like enjoy or play the game. Additionally the
> >
> > Unrelated ATAPI fodder :-)
> >
> > > current code asks for 16 sectors, but we do not do the request copy
> > > anymore, and this mean for every 4k of paging we are soliciting for 8k.
> >
> > The (now) missing copy is unrelated.
> >
> > > We only read out 4k thus the device has the the next 4k we may be wanting
> > > ready. Look at it as a dirty prefetch, but eventally the drive is going
> > > to want to go south, thus [lost interrupt]
> >
> > Even if the drive is programmed for 16 sectors in multi mode, it still
> > must honor lower transfer sizes. The fix I did was not to limit this,
> > but rather to only setup transfers for the amount of sectors in the
> > first chunk. This is indeed necessary now that we do not have a copy of
> > the request to fool around with.

Listen for just a second, okay.

Since the set multimode command is similar to the set transfer rate, if
you program the drive to run at U100 but the host can feed only U33 you
have problems. Much of this simple argument is the same answer for
multimode.

Same thing here, but a variation of the operations:

> > > Basically as the Block maintainer, you pointed out I am restricted to 4k
> > > chunking in PIO. You decided, in the interest of the block glue layer
> > > into the driver, to force early end request per Linus's requirements to
> > > return back every 4k completed to block regardless of the size of the
> > > total data requested.
> >
> > Correct. The solution I did (which was one of the two I suggested) is
> > still the cleanest, IMHO.

The "cleanest" != technical correctness. It may be the cleanest for the
current infrastructure of BLOCK, but by no means is it technically
correct. It is more the Darwinism of hammering a square object into a
round hole.

> > > For the above two condition to be properly satisfied, I have to adjust
> > > and apply one driver policy make the driver behave and give the desired
> > > results. We should note this will conform with future IDEMA proposals
> > > being submitted to the T committees.
> >
> > I still don't see a description of why this would cause a lost
> > interrupt. What is the flaw in my theory and/or code?

Because you think of the OS as defining the guidelines, and the reality is
that the hardware defines the rules and the OS has to work around them. I am
happy to allow you to continue to modify the ISR behavior and the command
behavior, and I am willing to be proven wrong. When the problem does not
go away, I will request we return to the rules of the hardware. And this
may require a MEMPOOL just like SCSI.

> Guys, i'm sorry to report you bad news but i still get 'lost interrupt'
> with all applied patches ( Anton and Andre ).

Regards,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development


2002-01-21 05:47:30

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Sun, 20 Jan 2002, Jens Axboe wrote:

> On Sat, Jan 19 2002, Andre Hedrick wrote:
> > > On Sat, Jan 19 2002, Andre Hedrick wrote:
> > > > On Sat, 19 Jan 2002, Jens Axboe wrote:
> > > >
> > > > > On Fri, Jan 18 2002, Davide Libenzi wrote:
> > > > > > Guys, instead of requiring an -m8 to every user that is observing this
> > > > > > problem, isn't it better that you limit it inside the driver until things
> > > > > > gets fixed ?
> > > > >
> > > > > There is no -m8 limit, 2.5.3-pre1 + ata253p1-2 patch handles any set
> > > > > multi mode value.
> > > > >
> > > > > --
> > > > > Jens Axboe
> > > > >
> > > >
> > > > And that will generate the [lost interrupt], and I have it fixed at all
> > > > levels too now.
> > >
> > > How so? I don't see the problem.
> >
> > Unlike ATAPI which will generally send you more data than requested on
> > itw own, ATA devices do not like enjoy or play the game. Additionally the
>
> Unrelated ATAPI fodder :-)
>
> > current code asks for 16 sectors, but we do not do the request copy
> > anymore, and this mean for every 4k of paging we are soliciting for 8k.
>
> The (now) missing copy is unrelated.
>
> > We only read out 4k thus the device has the the next 4k we may be wanting
> > ready. Look at it as a dirty prefetch, but eventally the drive is going
> > to want to go south, thus [lost interrupt]
>
> Even if the drive is programmed for 16 sectors in multi mode, it still
> must honor lower transfer sizes. The fix I did was not to limit this,
> but rather to only setup transfers for the amount of sectors in the
> first chunk. This is indeed necessary now that we do not have a copy of
> the request to fool around with.
>
> > Basically as the Block maintainer, you pointed out I am restricted to 4k
> > chunking in PIO. You decided, in the interest of the block glue layer
> > into the driver, to force early end request per Linus's requirements to
> > return back every 4k completed to block regardless of the size of the
> > total data requested.
>
> Correct. The solution I did (which was one of the two I suggested) is
> still the cleanest, IMHO.
>
> > For the above two condition to be properly satisfied, I have to adjust
> > and apply one driver policy make the driver behave and give the desired
> > results. We should note this will conform with future IDEMA proposals
> > being submitted to the T committees.
>
> I still don't see a description of why this would cause a lost
> interrupt. What is the flaw in my theory and/or code?

We issue a setmultimode command and the driver defaults to the maximum, or 16
sectors in most cases. This means the drive is expecting 16 sectors, and
your design is to issue only 8 sectors or fewer. Issuing 8 sectors
or fewer in the sector_count while the device is expecting 16 is a setup
for problems.

The effective operation your changes have created, without addressing all
the variables, is to terminate the command in progress. Therefore, the
decision made by you was to restrict the transfers to be processed to the
count in rq->current_nr_sectors. There is no bounds checking based on the
command executed.

*****************************
The questions to ask "How would the host terminate a command in progress,
since BSY=1 (or DRQ=1) at this point? Is that done via a DEVICE_RESET or
SRST write?"

Let me quote "Hale Landis" (one of the Fathers of ATA).

On an ATA device a H/W Reset or S/W reset is required to terminate a
command in progress. Writing to the Command register while BSY=1 or
BSY=0:DRQ=1 is illegal.

On an ATAPI device a H/W Reset or S/W reset or DEVICE RESET is
required to terminate a command in progress (and that includes PACKET
commands). Writing a value other than 08H (DEVICE RESET) to the
Command register while BSY=1 or BSY=0:DRQ=1 is illegal.
*****************************

Now what you have created is an illegal operation.

If the device is expecting a fixed amount of data and you stop it in
mid-stream and do not reset the device and issue a second command for
transfer, expect the device to go south.

If we are going to operate in this mode of brokenness, then let me finish
the change to the command structure to make the driver work in that
environment.

Regards,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development


2002-01-21 05:49:53

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Sat, 19 Jan 2002, Jens Axboe wrote:

> On Fri, Jan 18 2002, Davide Libenzi wrote:
> > Guys, instead of requiring an -m8 to every user that is observing this
> > problem, isn't it better that you limit it inside the driver until things
> > gets fixed ?
>
> There is no -m8 limit, 2.5.3-pre1 + ata253p1-2 patch handles any set
> multi mode value.
>
> --
> Jens Axboe
>

And that will generate the [lost interrupt], and I have it fixed at all
levels too now.


Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-21 05:49:51

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Sat, 19 Jan 2002, Jens Axboe wrote:

> On Sat, Jan 19 2002, Andre Hedrick wrote:
> > On Sat, 19 Jan 2002, Jens Axboe wrote:
> >
> > > On Fri, Jan 18 2002, Davide Libenzi wrote:
> > > > Guys, instead of requiring an -m8 to every user that is observing this
> > > > problem, isn't it better that you limit it inside the driver until things
> > > > gets fixed ?
> > >
> > > There is no -m8 limit, 2.5.3-pre1 + ata253p1-2 patch handles any set
> > > multi mode value.
> > >
> > > --
> > > Jens Axboe
> > >
> >
> > And that will generate the [lost interrupt], and I have it fixed at all
> > levels too now.
>
> How so? I don't see the problem.

Unlike ATAPI, which will generally send you more data than requested on
its own, ATA devices do not enjoy or play that game. Additionally, the
current code asks for 16 sectors, but we do not do the request copy
anymore, and this means that for every 4k of paging we are soliciting 8k.
We only read out 4k, thus the device has the next 4k we may be wanting
ready. Look at it as a dirty prefetch, but eventually the drive is going
to want to go south, thus [lost interrupt].

Basically as the Block maintainer, you pointed out I am restricted to 4k
chunking in PIO. You decided, in the interest of the block glue layer
into the driver, to force early end request per Linus's requirements to
return back every 4k completed to block regardless of the size of the
total data requested.

For the above two conditions to be properly satisfied, I have to adjust
and apply one driver policy to make the driver behave and give the desired
results. We should note this will conform with future IDEMA proposals
being submitted to the T committees.

Regards,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-21 05:49:50

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1


Yes,

I have a patch against clean 2.5.3-pre1, and it is up on kernel.org's upload
site, but k.o is down. It can also be fetched from
http://www.linuxdiskcert.org/
It is a tiny 37k patch, bzip2'd to 8k.


On Sat, 19 Jan 2002, Davide Libenzi wrote:

> On Sat, 19 Jan 2002, Andre Hedrick wrote:
>
> > On Sat, 19 Jan 2002, Jens Axboe wrote:
> >
> > > On Sat, Jan 19 2002, Andre Hedrick wrote:
> > > > On Sat, 19 Jan 2002, Jens Axboe wrote:
> > > >
> > > > > On Fri, Jan 18 2002, Davide Libenzi wrote:
> > > > > > Guys, instead of requiring an -m8 to every user that is observing this
> > > > > > problem, isn't it better that you limit it inside the driver until things
> > > > > > gets fixed ?
> > > > >
> > > > > There is no -m8 limit, 2.5.3-pre1 + ata253p1-2 patch handles any set
> > > > > multi mode value.
> > > > >
> > > > > --
> > > > > Jens Axboe
> > > > >
> > > >
> > > > And that will generate the [lost interrupt], and I have it fixed at all
> > > > levels too now.
> > >
> > > How so? I don't see the problem.
> >
> > Unlike ATAPI which will generally send you more data than requested on
> > itw own, ATA devices do not like enjoy or play the game. Additionally the
> > current code asks for 16 sectors, but we do not do the request copy
> > anymore, and this mean for every 4k of paging we are soliciting for 8k.
> > We only read out 4k thus the device has the the next 4k we may be wanting
> > ready. Look at it as a dirty prefetch, but eventally the drive is going
> > to want to go south, thus [lost interrupt]
> >
> > Basically as the Block maintainer, you pointed out I am restricted to 4k
> > chunking in PIO. You decided, in the interest of the block glue layer
> > into the driver, to force early end request per Linus's requirements to
> > return back every 4k completed to block regardless of the size of the
> > total data requested.
> >
> > For the above two condition to be properly satisfied, I have to adjust
> > and apply one driver policy make the driver behave and give the desired
> > results. We should note this will conform with future IDEMA proposals
> > being submitted to the T committees.
>
> That was it. By limiting the sector count request i was able to fix it.
> Do you've any permanent/not-hackish fix for this ?
>
>
>
>
> - Davide
>
>

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-21 06:19:50

by Matti Aarnio

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Sun, Jan 20, 2002 at 08:40:37PM -0800, Andre Hedrick wrote:
...
> Received: from astound-64-85-224-253.ca.astound.net ([64.85.224.253]:43535
> "EHLO master.linux-ide.org") by vger.kernel.org with ESMTP
> id <S289014AbSAUFpb>; Mon, 21 Jan 2002 00:45:31 -0500
> Received: from localhost (andre@localhost)
> by master.linux-ide.org (8.9.3/8.9.3) with ESMTP id UAA13058
> for <[email protected]>; Sun, 20 Jan 2002 20:40:37 -0800
> Date: Sun, 20 Jan 2002 20:40:37 -0800 (PST)
> From: Andre Hedrick <[email protected]>
> To: [email protected]
> Subject: Re: Linux 2.5.3-pre1-aia1
>
> vger eats a second one :-(

Don't blame vger for the constipation of your own machine.
Of course the network in between your machine and vger may
have been dysfunctional for a while, causing that slow delivery.

Hint: "mailq -v" is your friend.

/Matti Aarnio

2002-01-21 07:37:52

by Jens Axboe

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Sun, Jan 20 2002, Andre Hedrick wrote:
> > Even if the drive is programmed for 16 sectors in multi mode, it still
> > must honor lower transfer sizes. The fix I did was not to limit this,
> > but rather to only setup transfers for the amount of sectors in the
> > first chunk. This is indeed necessary now that we do not have a copy of
> > the request to fool around with.
> >
> > > Basically as the Block maintainer, you pointed out I am restricted to 4k
> > > chunking in PIO. You decided, in the interest of the block glue layer
> > > into the driver, to force early end request per Linus's requirements to
> > > return back every 4k completed to block regardless of the size of the
> > > total data requested.
> >
> > Correct. The solution I did (which was one of the two I suggested) is
> > still the cleanest, IMHO.
> >
> > > For the above two condition to be properly satisfied, I have to adjust
> > > and apply one driver policy make the driver behave and give the desired
> > > results. We should note this will conform with future IDEMA proposals
> > > being submitted to the T committees.
> >
> > I still don't see a description of why this would cause a lost
> > interrupt. What is the flaw in my theory and/or code?
>
> We issue a setmultimode command and the driver defaults to maximum or 16
> sectors in most cases. This means the drive is expecting 16 sectors, and

Correct so far.

> your design is to issue only 8 sectors or less. The issuing of 8 sectors
> or less in the sector_count, while the device is expecting 16 is a setup
> for problems.

No it's not. By your standards, that would mean that if the device is
setup for 16 sector multi mode, then I could never ever issue requests
less than that (without doing some crap 'toss away extra data' stuff).
How else would you handle, eg, 2 sector requests with multi mode set?

> The effective operations your changes have created without addressing all
> the variables is to terminate the command in process. Therefore, the
> decision made by you was to restrict the transfers to be process to the
> count in rq->current_nr_sectors. There is no bounds checking based on the
> command executed.

I'm not stopping a request in progress. I told the drive that the
request is current_nr_sectors big, so once it finishes transferring
current_nr_sectors sectors it truly thinks it's really done with that
request. And it is. However, I'm leaving the request on the queue (or,
really, ide_end_request is not taking it off because
end_that_request_first is not indicating it's complete). So I'm simply
starting from scratch with the remaining data. See?
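
The scheme described above can be modelled in a few lines of user-space C.
This is a toy sketch, not the ide driver code: the struct and both functions
are hypothetical stand-ins, and only the field names nr_sectors and
current_nr_sectors are taken from the thread. Each command is set up for
current_nr_sectors only, and the request stays queued until nothing is left.

```c
#include <stdio.h>

/* Toy stand-in for the fields of struct request used in the discussion. */
struct toy_request {
    unsigned long sector;             /* next sector to transfer */
    unsigned int nr_sectors;          /* sectors left in the whole request */
    unsigned int current_nr_sectors;  /* sectors in the current chunk */
};

/*
 * Complete the current chunk and set up the next one.
 * Returns 1 while sectors remain (request stays on the queue), 0 when done.
 */
static int toy_end_chunk(struct toy_request *rq, unsigned int chunk)
{
    rq->sector += rq->current_nr_sectors;
    rq->nr_sectors -= rq->current_nr_sectors;
    rq->current_nr_sectors = rq->nr_sectors < chunk ? rq->nr_sectors : chunk;
    return rq->nr_sectors != 0;
}

/* Issue one command per chunk; returns the number of commands issued. */
static unsigned int toy_issue_all(struct toy_request *rq, unsigned int chunk)
{
    unsigned int commands = 0;

    if (rq->nr_sectors == 0)
        return 0;
    do {
        /* The drive is told only about the current chunk. */
        printf("command: start sector %lu, sector count %u\n",
               rq->sector, rq->current_nr_sectors);
        commands++;
    } while (toy_end_chunk(rq, chunk));

    return commands;
}
```

With nr_sectors = 48 and 8-sector chunks, the model issues six separate
commands, each complete in itself from the host's point of view.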

> *****************************
> The questions to ask "How would the host terminate a command in progress,
> since BSY=1 (or DRQ=1) at this point? Is that done via a DEVICE_RESET or
> SRST write?"

[snip]

Moot, there's no premature termination going on.

--
Jens Axboe

2002-01-21 07:53:10

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, 21 Jan 2002, Jens Axboe wrote:

> On Sun, Jan 20 2002, Andre Hedrick wrote:
> > > Even if the drive is programmed for 16 sectors in multi mode, it still
> > > must honor lower transfer sizes. The fix I did was not to limit this,
> > > but rather to only setup transfers for the amount of sectors in the
> > > first chunk. This is indeed necessary now that we do not have a copy of
> > > the request to fool around with.
> > >
> > > > Basically as the Block maintainer, you pointed out I am restricted to 4k
> > > > chunking in PIO. You decided, in the interest of the block glue layer
> > > > into the driver, to force early end request per Linus's requirements to
> > > > return back every 4k completed to block regardless of the size of the
> > > > total data requested.
> > >
> > > Correct. The solution I did (which was one of the two I suggested) is
> > > still the cleanest, IMHO.
> > >
> > > > For the above two condition to be properly satisfied, I have to adjust
> > > > and apply one driver policy make the driver behave and give the desired
> > > > results. We should note this will conform with future IDEMA proposals
> > > > being submitted to the T committees.
> > >
> > > I still don't see a description of why this would cause a lost
> > > interrupt. What is the flaw in my theory and/or code?
> >
> > We issue a setmultimode command and the driver defaults to maximum or 16
> > sectors in most cases. This means the drive is expecting 16 sectors, and
>
> Correct so far.
>
> > your design is to issue only 8 sectors or less. The issuing of 8 sectors
> > or less in the sector_count, while the device is expecting 16 is a setup
> > for problems.
>
> No it's not. By your standards, that would mean that if the device is
> setup for 16 sector multi mode, then I could never ever issue requests
> less than that (without doing some crap 'toss away extra data' stuff).
> How else would you handle, eg, 2 sector requests with multi mode set?

Change the opcode in the command block to single sector, if
rq->current_nr_sectors != drive->multcount.
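
A minimal sketch of that suggestion, assuming the usual opcode values for
READ SECTOR(S) (20h) and READ MULTIPLE (C4h). The helper name is
hypothetical, and treating a multcount of 0 as "multi mode off" is an
assumption of this sketch.

```c
#include <stdint.h>

#define WIN_READ     0x20  /* READ SECTOR(S) */
#define WIN_MULTREAD 0xC4  /* READ MULTIPLE */

/*
 * Hypothetical helper: use the multi-sector opcode only when the chunk
 * matches the programmed multiple count exactly, otherwise fall back to
 * the single-sector opcode.
 */
static uint8_t pick_read_opcode(unsigned int current_nr_sectors,
                                unsigned int multcount)
{
    if (multcount != 0 && current_nr_sectors == multcount)
        return WIN_MULTREAD;
    return WIN_READ;
}
```

With multcount programmed to 16, an 8-sector chunk would go out as
WIN_READ and only a full 16-sector chunk as WIN_MULTREAD.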

> > The effective operations your changes have created without addressing all
> > the variables is to terminate the command in process. Therefore, the
> > decision made by you was to restrict the transfers to be process to the
> > count in rq->current_nr_sectors. There is no bounds checking based on the
> > command executed.
>
> I'm not stopping a request in progress. I told the drive that the
> request is current_nr_sectors big, so once it finishes transferring
> current_nr_sectors sectors it truly thinks it's really done with that
> request. And it is. However, I'm leaving the request on the queue (or,
> really, ide_end_request is not taking it off because
> end_that_request_first is not indicating it's complete). So I'm simply
> starting from scratch with the remaining data. See?

I know what you are doing, and I am trying to mate the requirement to use
the hardware to what you are sending down. The question you need to
answer is whether issuing a request for multi-sector transfers smaller than
what the device is expecting is sane and correct. If you tell me it is
correct, please show me where I read something wrong in the specification.

> > *****************************
> > The questions to ask "How would the host terminate a command in progress,
> > since BSY=1 (or DRQ=1) at this point? Is that done via a DEVICE_RESET or
> > SRST write?"
>
> [snip]
>
> Moot, there's no premature termination going on.

From the OS/HOST side you are 100% correct.
From the device side, do you know that for a fact?
Please read the difference in the two state-machine diagrams; they use the
same phase names, but each describes which end of the cable you are on
and the expected behaviors.

Respectfully,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-21 08:02:29

by Jens Axboe

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Sun, Jan 20 2002, Andre Hedrick wrote:
> > No it's not. By your standards, that would mean that if the device is
> > setup for 16 sector multi mode, then I could never ever issue requests
> > less than that (without doing some crap 'toss away extra data' stuff).
> > How else would you handle, eg, 2 sector requests with multi mode set?
>
> Change the opcode in the command block to single sector, if
> rq->current_nr_sectors != drive->multcount.

That crossed my mind too, however that's not what we've been doing in
the past and multi mode has worked fine.

> > > The effective operations your changes have created without addressing all
> > > the variables is to terminate the command in process. Therefore, the
> > > decision made by you was to restrict the transfers to be process to the
> > > count in rq->current_nr_sectors. There is no bounds checking based on the
> > > command executed.
> >
> > I'm not stopping a request in progress. I told the drive that the
> > request is current_nr_sectors big, so once it finishes transferring
> > current_nr_sectors sectors it truly thinks it's really done with that
> > request. And it is. However, I'm leaving the request on the queue (or,
> > really, ide_end_request is not taking it off because
> > end_that_request_first is not indicating it's complete). So I'm simply
> > starting from scratch with the remaining data. See?
>
> I know what you are doing, and I am trying to mate the requirement to use

Yes

> the hardware to what you are sending down. The question you need to
> answer is issuing a request for multi-sector transfers less than what the
> device is expecting, sane and correct. If you tell me it is correct,
> please show me where I read something wrong in the specification.

You are saying that even when I do:

/* this is our request */
rq->nr_sectors = 48;
rq->current_nr_sectors = 8;

/* drive->mult_count has been programmed to 16 */

/* bla bla command setup */
OUT_BYTE(rq->current_nr_sectors, IDE_NSECTOR_REG);
ide_set_handler(...);
OUT_BYTE(WIN_MULTREAD, IDE_COMMAND_REG);

The drive will be wanting to transfer _16_ sectors, even though I told
it that I want _8_. This sounds very strange to me, and it means that
2.2/2.4 etc should have never worked in multi mode. I'll go find the
spec now... I am just talking out of my ass.
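
One way to frame the question is to count data blocks. The sketch below
assumes a particular reading of READ MULTIPLE: the sector count register
holds the sectors requested, and the device moves as many full blocks of
mult_count sectors as fit, then one final partial block. That reading is an
assumption here and should be checked against the ATA specification; the
struct and function names are hypothetical.

```c
#include <assert.h>

/* How a READ MULTIPLE of sector_count sectors would split into blocks. */
struct mult_plan {
    unsigned int full_blocks;      /* blocks of exactly mult_count sectors */
    unsigned int partial_sectors;  /* sectors in the final short block, or 0 */
};

/* Assumes the drive was programmed with a non-zero mult_count. */
static struct mult_plan plan_read_multiple(unsigned int sector_count,
                                           unsigned int mult_count)
{
    struct mult_plan p;

    assert(mult_count > 0);
    p.full_blocks = sector_count / mult_count;
    p.partial_sectors = sector_count % mult_count;
    return p;
}
```

Under that assumption, the 8-sector command above with mult_count = 16 is
one partial block of 8 sectors, not a truncated 16-sector transfer.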

> > > *****************************
> > > The questions to ask "How would the host terminate a command in progress,
> > > since BSY=1 (or DRQ=1) at this point? Is that done via a DEVICE_RESET or
> > > SRST write?"
> >
> > [snip]
> >
> > Moot, there's no premature termination going on.
>
> > From the OS/HOST side you are 100% correct.

Yep

> > From the device side, do you know that for a fact?

No

> Please read the difference in the two state-machine diagrams, the have the
> same name phasing, but each describes which end of the cable you are on
> and the expected behavors.

I will do so now, I think I've stated my speculation above and in
earlier mails :-)

--
Jens Axboe

2002-01-21 08:48:53

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, 21 Jan 2002, Jens Axboe wrote:

> On Sun, Jan 20 2002, Andre Hedrick wrote:
> > > No it's not. By your standards, that would mean that if the device is
> > > setup for 16 sector multi mode, then I could never ever issue requests
> > > less than that (without doing some crap 'toss away extra data' stuff).
> > > How else would you handle, eg, 2 sector requests with multi mode set?
> >
> > Change the opcode in the command block to single sector, if
> > rq->current_nr_sectors != drive->multcount.
>
> That crossed my mind too, however that's not what we've been doing in
> the past and multi mode has worked fine.

And we have not done a lot of things in the past.
Mind the fact that you changed max-sectors from 128 to 255 != 256; the
problems may be a direct result of that. Mind the fact, it is my fault for
not telling you about the issue.

Since 128 and 256 are cleanly divisible by 2, 4, 8 and 16, we kind of
masked the problem, but 255 is not at all the same case.

Mind you Mark Lord did get this correct in the copy buffer issue, but the
bug introduced by 255 is the only problem I can trace to be suspect.

> > > > The effective operations your changes have created without addressing all
> > > > the variables is to terminate the command in process. Therefore, the
> > > > decision made by you was to restrict the transfers to be process to the
> > > > count in rq->current_nr_sectors. There is no bounds checking based on the
> > > > command executed.
> > >
> > > I'm not stopping a request in progress. I told the drive that the
> > > request is current_nr_sectors big, so once it finishes transferring
> > > current_nr_sectors sectors it truly thinks it's really done with that
> > > request. And it is. However, I'm leaving the request on the queue (or,
> > > really, ide_end_request is not taking it off because
> > > end_that_request_first is not indicating it's complete). So I'm simply
> > > starting from scratch with the remaining data. See?
> >
> > I know what you are doing, and I am trying to mate the requirement to use
>
> Yes

Good we are still in agreement.

> > the hardware to what you are sending down. The question you need to
> > answer is issuing a request for multi-sector transfers less than what the
> > device is expecting, sane and correct. If you tell me it is correct,
> > please show me where I read something wrong in the specification.
>
> You are saying that even when I do:
>
> /* this is our request */
> rq->nr_sectors = 48;
> rq->current_nr_sectors = 8;
>
> /* drive->mult_count has been programmed to 16 */

You execute WIN_MULTREAD and it behaves based on how the device has been
programmed to respond.

If you want 8 sectors only, by golly you had better tell it to expect 8
sectors and then you can interrupt upon completion.

If it expects 16 sectors and you stop at 8, and issue a new command,
expect the device to go south.

> /* bla bla command setup */
> OUT_BYTE(rq->current_nr_sectors, IDE_NSECTOR_REG);
> ide_set_hander(...);
> OUT_BYTE(WIN_MULTREAD, IDE_COMMAND_REG);
>
> The drive will be wanting to transfer _16_ sectors, even though I told
> it that I want _8_. This sounds very strange to me, and it means that
> 2.2/2.4 etc should have never worked in multi mode. I'll go find the
> spec now... I am just talking out of my ass.

See above. And if the DRIVE or SPEC is wrong because we are doing it by
the book, we know where to raise a stink.

> > > > *****************************
> > > > The questions to ask "How would the host terminate a command in progress,
> > > > since BSY=1 (or DRQ=1) at this point? Is that done via a DEVICE_RESET or
> > > > SRST write?"
> > >
> > > [snip]
> > >
> > > Moot, there's no premature termination going on.
> >
> > >From the OS/HOST side you are 100% correct.
>
> Yep
>
> > >From the device side, do you know that for a fact?
>
> No
>
> > Please read the difference in the two state-machine diagrams, the have the
> > same name phasing, but each describes which end of the cable you are on
> > and the expected behavors.
>
> I will do so now, I think I've stated my speculation above and in
> earlier mails :-)

ERM, no ... This is a classic miscommunication event.
You have the analyzers, look for yourself.


Respectfully,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-21 09:01:15

by Jens Axboe

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1


(have read up on mult now)

On Mon, Jan 21 2002, Andre Hedrick wrote:
> > On Sun, Jan 20 2002, Andre Hedrick wrote:
> > > > No it's not. By your standards, that would mean that if the device is
> > > > setup for 16 sector multi mode, then I could never ever issue requests
> > > > less than that (without doing some crap 'toss away extra data' stuff).
> > > > How else would you handle, eg, 2 sector requests with multi mode set?
> > >
> > > Change the opcode in the command block to single sector, if
> > > rq->current_nr_sectors != drive->multcount.
> >
> > That crossed my mind too, however that's not what we've been doing in
> > the past and multi mode has worked fine.
>
> And we have not done a lot of things in the past.
> Mind the fact, before you changed max-sectors from 128 to 255 != 256, he
> problems maybe a direct result. Mind the fact, it is my fault for not
> telling you about the issue.
>
> Since 128 and 256 are clearly 2,4,8,16 divisable and clean, as a result we
> kind of masked the problem, but 255 is not at all the same issue.

But, eg, 24 sectors is not and we would still be starting a multi
read/write for that...

> Mind you Mark Lord did get this correct in the copy buffer issue, but the
> bug introduced by 255 is the only problem I can trace to be suspect.

255 is effectively 248 (256 - 8), however that is still not correct when
modulo a 16 multi sector setting.
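
To make the arithmetic concrete, here is a minimal sketch in C. It assumes
512-byte sectors and 4k pages, so requests grow in 8-sector steps, which is
why a 255 limit behaves like 248. Both helper names are hypothetical and not
taken from the IDE driver.

```c
#include <assert.h>

/* Assumption: 512-byte sectors and 4 KiB pages, i.e. 8 sectors per page. */
#define SECTORS_PER_PAGE 8u

/* Hypothetical helper: largest page-aligned sector count <= max_sectors. */
static unsigned int page_aligned_sectors(unsigned int max_sectors)
{
    return max_sectors - (max_sectors % SECTORS_PER_PAGE);
}

/* Hypothetical helper: sectors left over after all full multi-mode blocks. */
static unsigned int partial_block_sectors(unsigned int nr_sectors,
                                          unsigned int mult_count)
{
    assert(mult_count > 0);
    return nr_sectors % mult_count;
}
```

With a limit of 255 the first helper gives 248, and 248 with a multi count
of 16 leaves a partial block of 8 sectors; the old limit of 128 leaves none.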

> > > the hardware to what you are sending down. The question you need to
> > > answer is issuing a request for multi-sector transfers less than what the
> > > device is expecting, sane and correct. If you tell me it is correct,
> > > please show me where I read something wrong in the specification.
> >
> > You are saying that even when I do:
> >
> > /* this is our request */
> > rq->nr_sectors = 48;
> > rq->current_nr_sectors = 8;
> >
> > /* drive->mult_count has been programmed to 16 */
>
> You exectute WIN_MULTREAD and it behaves based on what the device has been
> programmed to do respond.
>
> If you want 8 sectors only, by golly you had better tell it expect 8
> sectors and then you can interrupt upon completion.
>
> If it expects 16 sectors and you stop at 8, and issue a new command,
> expect the device to go south.

This really sucks, it means we cannot safely use multi mode for a
variety of request sizes. I agree with your earlier comment. Here's what
I think we should be doing: when requesting multi mode, limit to 8
sectors like in your patch. This is by far the most common multiple,
that's why. When starting a request, use WIN_MULT* only for cases where
(rq->nr_sectors % drive->mult_count) == 0. If that doesn't hold, simply
use WIN_READ or WIN_WRITE.
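
The rule proposed in this mail could look roughly like the sketch below. It
is only an illustration of the modulo test: pick_pio_opcode() is a
hypothetical helper, not a function in the IDE driver, and the opcodes are
defined locally (with the values used in the kernel's hdreg.h) so the
fragment stands on its own.

```c
#include <assert.h>

/* ATA opcodes, defined locally so the sketch is self-contained. */
#define WIN_READ      0x20
#define WIN_WRITE     0x30
#define WIN_MULTREAD  0xC4
#define WIN_MULTWRITE 0xC5

/*
 * Hypothetical helper: use the multi-mode opcode only when the request
 * is an exact multiple of the programmed multi count, otherwise fall
 * back to the plain single-sector PIO opcode.
 */
static unsigned char pick_pio_opcode(unsigned long nr_sectors,
                                     unsigned int mult_count, int is_write)
{
    assert(nr_sectors > 0);

    /* mult_count == 0 is taken to mean multi mode is disabled */
    if (mult_count && (nr_sectors % mult_count) == 0)
        return is_write ? WIN_MULTWRITE : WIN_MULTREAD;

    return is_write ? WIN_WRITE : WIN_READ;
}
```

Under this rule a 48-sector read with a multi count of 16 would go out as
WIN_MULTREAD, while a 24-sector read would fall back to WIN_READ.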

Applied the 2.5.3-pre2 sched SMP fix, booting -pre2 and then hacking up
a patch.

--
Jens Axboe

2002-01-21 09:05:46

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, 21 Jan 2002, Jens Axboe wrote:

>
> (have read up on mult now)
>
> On Mon, Jan 21 2002, Andre Hedrick wrote:
> > > On Sun, Jan 20 2002, Andre Hedrick wrote:
> > > > > No it's not. By your standards, that would mean that if the device is
> > > > > setup for 16 sector multi mode, then I could never ever issue requests
> > > > > less than that (without doing some crap 'toss away extra data' stuff).
> > > > > How else would you handle, eg, 2 sector requests with multi mode set?
> > > >
> > > > Change the opcode in the command block to single sector, if
> > > > rq->current_nr_sectors != drive->multcount.
> > >
> > > That crossed my mind too, however that's not what we've been doing in
> > > the past and multi mode has worked fine.
> >
> > And we have not done a lot of things in the past.
> > Mind the fact, before you changed max-sectors from 128 to 255 != 256, he
> > problems maybe a direct result. Mind the fact, it is my fault for not
> > telling you about the issue.
> >
> > Since 128 and 256 are clearly 2,4,8,16 divisable and clean, as a result we
> > kind of masked the problem, but 255 is not at all the same issue.
>
> But, eg, 24 sectors is not and we would still be starting a multi
> read/write for that...
>
> > Mind you Mark Lord did get this correct in the copy buffer issue, but the
> > bug introduced by 255 is the only problem I can trace to be suspect.
>
> 255 is effectively 248 (256 - 8), however that is still not correct when
> modulo a 16 multi sector setting.
>
> > > > the hardware to what you are sending down. The question you need to
> > > > answer is issuing a request for multi-sector transfers less than what the
> > > > device is expecting, sane and correct. If you tell me it is correct,
> > > > please show me where I read something wrong in the specification.
> > >
> > > You are saying that even when I do:
> > >
> > > /* this is our request */
> > > rq->nr_sectors = 48;
> > > rq->current_nr_sectors = 8;
> > >
> > > /* drive->mult_count has been programmed to 16 */
> >
> > You exectute WIN_MULTREAD and it behaves based on what the device has been
> > programmed to do respond.
> >
> > If you want 8 sectors only, by golly you had better tell it expect 8
> > sectors and then you can interrupt upon completion.
> >
> > If it expects 16 sectors and you stop at 8, and issue a new command,
> > expect the device to go south.
>
> This really sucks, it means we cannot safely use multi mode for a
> variety of request sizes. I agree with your earlier comment. Here's what
> I think we should be doing: when requesting multi mode, limit to 8
> sectors like in your patch. This is by far the most commen multiple,
> that's why. When starting a request, use WIN_MULT* only for cases where
> (rq->nr_sectors % drive->mult_count) == 0. If that doesn't hold, simply
> use WIN_READ or WIN_WRITE.
>
> Applied the 2.5.3-pre2 sched SMP fix, booting -pre2 and then hacking up
> a patch.

Why? I have already done it, just take and apply.

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-21 09:07:36

by Jens Axboe

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, Jan 21 2002, Andre Hedrick wrote:
> > This really sucks, it means we cannot safely use multi mode for a
> > variety of request sizes. I agree with your earlier comment. Here's what
> > I think we should be doing: when requesting multi mode, limit to 8
> > sectors like in your patch. This is by far the most commen multiple,
> > that's why. When starting a request, use WIN_MULT* only for cases where
> > (rq->nr_sectors % drive->mult_count) == 0. If that doesn't hold, simply
> > use WIN_READ or WIN_WRITE.
> >
> > Applied the 2.5.3-pre2 sched SMP fix, booting -pre2 and then hacking up
> > a patch.
>
> Why I have already done it, just take and apply.

Cool, where is it?

--
Jens Axboe

2002-01-21 09:54:42

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, 21 Jan 2002, Jens Axboe wrote:

> On Mon, Jan 21 2002, Andre Hedrick wrote:
> > > This really sucks, it means we cannot safely use multi mode for a
> > > variety of request sizes. I agree with your earlier comment. Here's what
> > > I think we should be doing: when requesting multi mode, limit to 8
> > > sectors like in your patch. This is by far the most commen multiple,
> > > that's why. When starting a request, use WIN_MULT* only for cases where
> > > (rq->nr_sectors % drive->mult_count) == 0. If that doesn't hold, simply
> > > use WIN_READ or WIN_WRITE.
> > >
> > > Applied the 2.5.3-pre2 sched SMP fix, booting -pre2 and then hacking up
> > > a patch.
> >
> > Why I have already done it, just take and apply.
>
> Cool, where is it?

Attached, and please do not pick and choose.

I moved and added things for a reason, so as not to lose hard work, because
of writing the ISRs to the purity of the spec, and then we modify the ISRs
to fit the kernel and not the other way around. I do have a just reason
to request a MEMPOOL, which would be exclusively used for PIO operations.
Then we get out of the mess we are in and get into serious compliance with
how the hardware works.

Thus in the offline comments about the creation of an ata_request_struct,
a mempool allocation for PIO is justified. Since the correct solution of
DMA timeouts is to void the request and assume no data down is valid.
Thus PIO is next.

If we look at the overhead in the generation of a new request for each and
every time we do a PIO transfer it is scary. Think about this issue for
more than the time it takes to hit the delete key.

Respectfully,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-21 10:43:36

by Vojtech Pavlik

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Sun, Jan 20, 2002 at 04:12:36PM -0800, Andre Hedrick wrote:

> > > > We only read out 4k thus the device has the the next 4k we may be wanting
> > > > ready. Look at it as a dirty prefetch, but eventally the drive is going
> > > > to want to go south, thus [lost interrupt]
> > >
> > > Even if the drive is programmed for 16 sectors in multi mode, it still
> > > must honor lower transfer sizes. The fix I did was not to limit this,
> > > but rather to only setup transfers for the amount of sectors in the
> > > first chunk. This is indeed necessary now that we do not have a copy of
> > > the request to fool around with.
>
> Listen and for just a second okay.
>
> Since the set multimode command is similar to the set transfer rate, if
> you program the drive to run at U100 but the host can feed only U33 you
> have problems. Much of this simple arguement is the same answer for
> multimode.
>
> Same thing here but a variation, of the operations,

So you're saying that if you program the drive to multimode 16, you
can't read a single sector and always have to read 16? That not only
doesn't make sense to me, but it also contradicts anything that I've
heard before.

--
Vojtech Pavlik
SuSE Labs

2002-01-21 10:49:16

by Jens Axboe

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, Jan 21 2002, Vojtech Pavlik wrote:
> On Sun, Jan 20, 2002 at 04:12:36PM -0800, Andre Hedrick wrote:
>
> > > > > We only read out 4k thus the device has the the next 4k we may be wanting
> > > > > ready. Look at it as a dirty prefetch, but eventally the drive is going
> > > > > to want to go south, thus [lost interrupt]
> > > >
> > > > Even if the drive is programmed for 16 sectors in multi mode, it still
> > > > must honor lower transfer sizes. The fix I did was not to limit this,
> > > > but rather to only setup transfers for the amount of sectors in the
> > > > first chunk. This is indeed necessary now that we do not have a copy of
> > > > the request to fool around with.
> >
> > Listen and for just a second okay.
> >
> > Since the set multimode command is similar to the set transfer rate, if
> > you program the drive to run at U100 but the host can feed only U33 you
> > have problems. Much of this simple arguement is the same answer for
> > multimode.
> >
> > Same thing here but a variation, of the operations,
>
> So you're saying that if you program the drive to multimode 16, you
> can't read a single sector and always have to read 16? That not only
> doesn't make sense to me, but it also contradicts anything that I've
> heard before.

Well it didn't/doesn't make sense to me either, let me quote spec
though:

(READ_MULTIPLE)

"If the number of requested sectors is not evenly divisible by the block
count, as many full blocks as possible are transferred, followed by a
final, partial block transfer."

(block count being the multi setting here)

I actually misread this the first time around, it seems my original code
was indeed correct (and that 2.4 of course also is). For the example 24
sector request and multi mode of 16, the drive _will_ only expect 8
sectors in the final run. That makes sense to me again, I couldn't
understand the apparent brain damage in the model Andre suggested.
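
The spec wording quoted above can be sketched as a small C fragment: each
data block carries at most the multi count, and the last one carries
whatever is left. The two helpers are hypothetical names chosen for this
illustration, not driver code.

```c
#include <assert.h>

/* Hypothetical helper: sectors the drive moves in the next data block. */
static unsigned int next_block_sectors(unsigned int remaining,
                                       unsigned int mult_count)
{
    assert(mult_count > 0);
    return remaining < mult_count ? remaining : mult_count;
}

/* Hypothetical helper: number of data blocks for one READ_MULTIPLE. */
static unsigned int count_data_blocks(unsigned int nr_sectors,
                                      unsigned int mult_count)
{
    unsigned int blocks = 0;

    while (nr_sectors > 0) {
        /* full blocks first, then one final partial block if needed */
        nr_sectors -= next_block_sectors(nr_sectors, mult_count);
        blocks++;
    }
    return blocks;
}
```

For the 24-sector example with a multi count of 16 this yields two blocks,
16 sectors and then 8.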

Time for a new patch...

--
Jens Axboe

2002-01-21 10:56:26

by Jens Axboe

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, Jan 21 2002, Jens Axboe wrote:
> Time for a new patch...

Actually, then I did get it right in 2.5.3-pre2 so no issues. Only
problem is the 48-bit addressing nr_sectors bug, however that can't hit
right now so it's not an issue.

That just leaves Davide's lost interrupt issue, let's look into that
now...

--
Jens Axboe

2002-01-21 11:15:37

by Vojtech Pavlik

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, Jan 21, 2002 at 11:48:30AM +0100, Jens Axboe wrote:
> On Mon, Jan 21 2002, Vojtech Pavlik wrote:
> > On Sun, Jan 20, 2002 at 04:12:36PM -0800, Andre Hedrick wrote:
> >
> > > > > > We only read out 4k thus the device has the the next 4k we may be wanting
> > > > > > ready. Look at it as a dirty prefetch, but eventally the drive is going
> > > > > > to want to go south, thus [lost interrupt]
> > > > >
> > > > > Even if the drive is programmed for 16 sectors in multi mode, it still
> > > > > must honor lower transfer sizes. The fix I did was not to limit this,
> > > > > but rather to only setup transfers for the amount of sectors in the
> > > > > first chunk. This is indeed necessary now that we do not have a copy of
> > > > > the request to fool around with.
> > >
> > > Listen and for just a second okay.
> > >
> > > Since the set multimode command is similar to the set transfer rate, if
> > > you program the drive to run at U100 but the host can feed only U33 you
> > > have problems. Much of this simple arguement is the same answer for
> > > multimode.
> > >
> > > Same thing here but a variation, of the operations,
> >
> > So you're saying that if you program the drive to multimode 16, you
> > can't read a single sector and always have to read 16? That not only
> > doesn't make sense to me, but it also contradicts anything that I've
> > heard before.
>
> Well it didn't/doesn't make sense to me either, let me quote spec
> though:
>
> (READ_MULTIPLE)
>
> "If the number of requested sectors is not evenly divisible by the block
> count, as many full blocks as possible are transferred, followed by a
> final, partial block transfer."
>
> (block count being the multi setting here)
>
> I actually misread this the first time around, it seems my original code
> was indeed correct (and that 2.4 of course also is). For the example 24
> sector request and multi mode of 16, the drive _will_ only expect 8
> sectors in the final run. That makes sense to me again, I couldn't
> understand the apparent brain damage in the model Andre suggested.
>
> Time for a new patch...

I always thought it is like this (and this is what I still believe after
having read the specification):

---
SET_MULTIPLE 16 sectors
---
READ_MULTIPLE 24 sectors
IRQ
PIO transfer 16 sectors
IRQ
PIO transfer 8 sectors
---

Where am I wrong?

By the way, the device *isn't* required to support any lower multiple
count than the maximum one it advertizes. Ugly.

--
Vojtech Pavlik
SuSE Labs

2002-01-21 11:29:11

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, 21 Jan 2002, Vojtech Pavlik wrote:

> On Sun, Jan 20, 2002 at 04:12:36PM -0800, Andre Hedrick wrote:
>
> > > > > We only read out 4k thus the device has the the next 4k we may be wanting
> > > > > ready. Look at it as a dirty prefetch, but eventally the drive is going
> > > > > to want to go south, thus [lost interrupt]
> > > >
> > > > Even if the drive is programmed for 16 sectors in multi mode, it still
> > > > must honor lower transfer sizes. The fix I did was not to limit this,
> > > > but rather to only setup transfers for the amount of sectors in the
> > > > first chunk. This is indeed necessary now that we do not have a copy of
> > > > the request to fool around with.
> >
> > Listen and for just a second okay.
> >
> > Since the set multimode command is similar to the set transfer rate, if
> > you program the drive to run at U100 but the host can feed only U33 you
> > have problems. Much of this simple arguement is the same answer for
> > multimode.
> >
> > Same thing here but a variation, of the operations,
>
> So you're saying that if you program the drive to multimode 16, you
> can't read a single sector and always have to read 16? That not only
> doesn't make sense to me, but it also contradicts anything that I've
> heard before.

Vojtech,

If the device is programmed to do 16 sectors in multimode, and you
issue a read/write multiple PIO command and short-change the device, it is
not going to like it. However, if it is programmed for multimode and you
issue single-sector PIO transfer command opcodes, it is fine.

Do we differ?

Regards,



Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-21 11:30:42

by Jens Axboe

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, Jan 21 2002, Vojtech Pavlik wrote:
> I always thought it is like this (and this is what I still believe after
> having read the sprcification):
>
> ---
> SET_MUTIPLE 16 sectors
> ---
> READ_MULTIPLE 24 sectors
> IRQ
> PIO transfer 16 sectors
> IRQ
> PIO transfer 8 sectors
> ---
>
> Where am I wrong?

I agree completely, see previous mail.

> By the way, the device *isn't* required to support any lower multiple
> count than the maximum one it advertizes. Ugly.

Oh? That basically narrows down the multi count value from hdparm to a
boolean on-or-off. I'd be surprised to see any drives break with lower
multi count in real life, though..

--
Jens Axboe

2002-01-21 11:33:42

by Vojtech Pavlik

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, Jan 21, 2002 at 03:22:20AM -0800, Andre Hedrick wrote:
> On Mon, 21 Jan 2002, Vojtech Pavlik wrote:
>
> > On Sun, Jan 20, 2002 at 04:12:36PM -0800, Andre Hedrick wrote:
> >
> > > > > > We only read out 4k thus the device has the the next 4k we may be wanting
> > > > > > ready. Look at it as a dirty prefetch, but eventally the drive is going
> > > > > > to want to go south, thus [lost interrupt]
> > > > >
> > > > > Even if the drive is programmed for 16 sectors in multi mode, it still
> > > > > must honor lower transfer sizes. The fix I did was not to limit this,
> > > > > but rather to only setup transfers for the amount of sectors in the
> > > > > first chunk. This is indeed necessary now that we do not have a copy of
> > > > > the request to fool around with.
> > >
> > > Listen and for just a second okay.
> > >
> > > Since the set multimode command is similar to the set transfer rate, if
> > > you program the drive to run at U100 but the host can feed only U33 you
> > > have problems. Much of this simple arguement is the same answer for
> > > multimode.
> > >
> > > Same thing here but a variation, of the operations,
> >
> > So you're saying that if you program the drive to multimode 16, you
> > can't read a single sector and always have to read 16? That not only
> > doesn't make sense to me, but it also contradicts anything that I've
> > heard before.
>
> Vojtech,
>
> If the device is programmed for to do 16 sectors in multimode, it and you
> issue a read/write multiple pio and short change the device it is not
> going to like it. However if it is programmed for multimode and you issue
> single sector pio transfers command opcodes it is fine.
>
> Do we differ?

I think so. Check my mail from 11:14:56 GMT today. I fully understand
that if I supply less data to the device than it expects or get less
from it than it has, it'll be a problem. But I think the specification
doesn't prohibit reading amounts not divisible by multimode setting via
the multimode command. I've read it quite carefully again.

--
Vojtech Pavlik
SuSE Labs

2002-01-21 11:35:02

by Jens Axboe

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, Jan 21 2002, Vojtech Pavlik wrote:
> > If the device is programmed for to do 16 sectors in multimode, it and you
> > issue a read/write multiple pio and short change the device it is not
> > going to like it. However if it is programmed for multimode and you issue
> > single sector pio transfers command opcodes it is fine.
> >
> > Do we differ?
>
> I think so. Check my mail from 11:14:56 GMT today. I fully understand
> that if I supply less data to the device than it expects or get less
> from it than it has, it'll be a problem. But I think the specification
> doesn't prohibit reading amounts not divisible by multimode setting via
> the multimode command. I've read it quite carefully again.

Like I said, if this was indeed a problem it would be _trivial_ to break
2.2/2.4 IDE with enabled multi mode... Basically it would be hard to get
_anything_ done.

--
Jens Axboe

2002-01-21 11:40:52

by Vojtech Pavlik

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, Jan 21, 2002 at 12:29:54PM +0100, Jens Axboe wrote:
> On Mon, Jan 21 2002, Vojtech Pavlik wrote:
> > I always thought it is like this (and this is what I still believe after
> > having read the sprcification):
> >
> > ---
> > SET_MUTIPLE 16 sectors
> > ---
> > READ_MULTIPLE 24 sectors
> > IRQ
> > PIO transfer 16 sectors
> > IRQ
> > PIO transfer 8 sectors
> > ---
> >
> > Where am I wrong?
>
> I agree completely, see previous mail.
>
> > By the way, the device *isn't* required to support any lower multiple
> > count than the maximum one it advertizes. Ugly.
>
> Oh? That basically narrows down the multi count value from hdparm to a
> boolean on-or-off. I'd be surprised to see any drives break with lower
> multi count in real life, though..

The spec seems to mandate to check the Identify data again after setting
new Multmode to see whether the drive supported the value we wanted to
program it to.
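
A minimal sketch of such a check in C, assuming the Identify data has
already been re-read into an array of 16-bit words. As far as I read the
ATA spec, word 47 (bits 7:0) holds the maximum multi count and word 59
holds the current setting (bits 7:0) with a valid flag in bit 8; the
function name and the macros are hypothetical, not driver code.

```c
#include <assert.h>

/* Identify Device word offsets (assumed from the ATA spec). */
#define ID_WORD_MAX_MULTSECT 47 /* bits 7:0: max sectors per block */
#define ID_WORD_CUR_MULTSECT 59 /* bit 8: valid, bits 7:0: current setting */

/*
 * Hypothetical helper: returns 1 if the freshly read Identify data says
 * the drive is now running with the multi count we asked for, else 0.
 */
static int mult_setting_accepted(const unsigned short *id, unsigned int wanted)
{
    unsigned int max, cur;

    assert(id != 0);

    max = id[ID_WORD_MAX_MULTSECT] & 0x00ff;
    if (wanted == 0 || wanted > max)
        return 0;

    /* bit 8 clear: the current setting field is not valid */
    if (!(id[ID_WORD_CUR_MULTSECT] & 0x0100))
        return 0;

    cur = id[ID_WORD_CUR_MULTSECT] & 0x00ff;
    return cur == wanted;
}
```

A caller would issue the set-multiple command, re-read the Identify data,
and only enable multi mode if this check passes.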

--
Vojtech Pavlik
SuSE Labs

2002-01-21 11:40:42

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, 21 Jan 2002, Vojtech Pavlik wrote:

> On Mon, Jan 21, 2002 at 11:48:30AM +0100, Jens Axboe wrote:
> > On Mon, Jan 21 2002, Vojtech Pavlik wrote:
> > > On Sun, Jan 20, 2002 at 04:12:36PM -0800, Andre Hedrick wrote:
> > >
> > > > > > > We only read out 4k thus the device has the the next 4k we may be wanting
> > > > > > > ready. Look at it as a dirty prefetch, but eventally the drive is going
> > > > > > > to want to go south, thus [lost interrupt]
> > > > > >
> > > > > > Even if the drive is programmed for 16 sectors in multi mode, it still
> > > > > > must honor lower transfer sizes. The fix I did was not to limit this,
> > > > > > but rather to only setup transfers for the amount of sectors in the
> > > > > > first chunk. This is indeed necessary now that we do not have a copy of
> > > > > > the request to fool around with.
> > > >
> > > > Listen and for just a second okay.
> > > >
> > > > Since the set multimode command is similar to the set transfer rate, if
> > > > you program the drive to run at U100 but the host can feed only U33 you
> > > > have problems. Much of this simple arguement is the same answer for
> > > > multimode.
> > > >
> > > > Same thing here but a variation, of the operations,
> > >
> > > So you're saying that if you program the drive to multimode 16, you
> > > can't read a single sector and always have to read 16? That not only
> > > doesn't make sense to me, but it also contradicts anything that I've
> > > heard before.
> >
> > Well it didn't/doesn't make sense to me either, let me quote spec
> > though:
> >
> > (READ_MULTIPLE)
> >
> > "If the number of requested sectors is not evenly divisible by the block
> > count, as many full blocks as possible are transferred, followed by a
> > final, partial block transfer."
> >
> > (block count being the multi setting here)
> >
> > I actually misread this the first time around, it seems my original code
> > was indeed correct (and that 2.4 of course also is). For the example 24
> > sector request and multi mode of 16, the drive _will_ only expect 8
> > sectors in the final run. That makes sense to me again, I couldn't
> > understand the apparent brain damage in the model Andre suggested.
> >
> > Time for a new patch...
>
> I always thought it is like this (and this is what I still believe after
> having read the sprcification):
>
> ---
> SET_MUTIPLE 16 sectors
> ---
> READ_MULTIPLE 24 sectors
> IRQ
> PIO transfer 16 sectors
> IRQ
> PIO transfer 8 sectors
> ---
>
> Where am I wrong?
>
> By the way, the device *isn't* required to support any lower multiple
> count than the maximum one it advertizes. Ugly.

No but the HOST is to obey the requirements of the device.
The spec is written from the drive side not the host side.

"All Ye Hosts, SHALL address me in such a manner as described, or be
aborted or I SHALL remain in an undetermined state."

Note only recently have the HOSTS been able to set up guidelines for what
is sane and not stupid for the device to do or how to behave.

Again, the HOST (Linux) is not following the device-side rules, so expect
difficulty when we depart from them. The Brain Damage is in how to talk to
the hardware, and it is clear we are not doing it right because we are
bending the rules to stuff it into an API that is not acceptable. However,
we are stuck. Again, simplicity works: generate a MEMPOOL for PIO such that
the buffer pages are contiguous and the 4k page dance is a NOOP. Until that
time we will be fussing about.

Regards,


Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-21 12:10:46

by Andre Hedrick

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, 21 Jan 2002, Vojtech Pavlik wrote:

> On Mon, Jan 21, 2002 at 12:29:54PM +0100, Jens Axboe wrote:
> > On Mon, Jan 21 2002, Vojtech Pavlik wrote:
> > > I always thought it is like this (and this is what I still believe after
> > > having read the sprcification):
> > >
> > > ---
> > > SET_MUTIPLE 16 sectors
> > > ---
> > > READ_MULTIPLE 24 sectors
> > > IRQ
> > > PIO transfer 16 sectors
> > > IRQ
> > > PIO transfer 8 sectors
> > > ---
> > >
> > > Where am I wrong?
> >
> > I agree completely, see previous mail.
> >
> > > By the way, the device *isn't* required to support any lower multiple
> > > count than the maximum one it advertizes. Ugly.
> >
> > Oh? That basically narrows down the multi count value from hdparm to a
> > boolean on-or-off. I'd be surprised to see any drives break with lower
> > multi count in real life, though..
>
> The spec seems to mandate to check the Identify data again after setting
> new Multmode to see whether the drive supported the value we wanted to
> program it to.

Forget the TEXT in the command description, because what you are looking for
is not there. It is stated and expressed in the state diagrams only.
There is minimal text, and only timing profiles for the device
manufacturers. How to make it go is in the pictures, and the text is only
supporting information.

That is why it is so painful to decode, and why it does not fit into the
requirement of early return of every 4k or page of data back to the upper
layers. So if we cannot do the entire transfer with contiguous memory, we
are forced into this game of jumping through hoops.

Regards,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-21 17:38:34

by Davide Libenzi

[permalink] [raw]
Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, 21 Jan 2002, Jens Axboe wrote:

> On Mon, Jan 21 2002, Jens Axboe wrote:
> > Time for a new patch...
>
> Actually, then I did get it right in 2.5.3-pre2 so no issues. Only
> problem is the 48-bit addressing nr_sectors bug, however that can't hit
> right now so it's not an issue.
>
> That just leaves Davide's lost interrupt issue, lets look into that
> now...

Guys, am I the only fool in town getting this one?



- Davide


2002-01-21 17:45:44

by Jens Axboe

Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, Jan 21 2002, Andre Hedrick wrote:
> > I always thought it is like this (and this is what I still believe after
> > having read the sprcification):
> >
> > ---
> > SET_MUTIPLE 16 sectors
> > ---
> > READ_MULTIPLE 24 sectors
> > IRQ
> > PIO transfer 16 sectors
> > IRQ
> > PIO transfer 8 sectors
> > ---
> >
> > Where am I wrong?
> >
> > By the way, the device *isn't* required to support any lower multiple
> > count than the maximum one it advertizes. Ugly.
>
> No but the HOST is to obey the requirements of the device.
> The spec is written from the drive side not the host side.
>
> "All Ye Hosts, SHALL address me in such a manner as described, or be
> aborted or I SHALL remain in an undertermined state."
>
> Note only recently have the HOSTS been about to setup guidelines for what
> is sane and not stupid for the device to do or behave.
>
> Again, the HOST(Linux) is not following the device side rules so expect
> difficulty when we depart. The Brain Damage is how to talk to the
> hardware, and it is clear we are not doing it right because we are bending
> the rules stuff it into and API that not acceptable. However we are
> stuck. Again, simplicity works, generate a MEMPOOL for PIO such that the
> buffer pages are contigious and the 4k page dance is a NOOP. Until that
> time we will be fussing about.

Andre,

Do you know how to say "I was wrong"? You are walking off-track again.
It's clearly the way that Vojtech and I describe, otherwise the current code
would just not work. Neither would 2.4, 2.2 or 2.0.

--
Jens Axboe

2002-01-21 20:36:12

by Andre Hedrick

Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, 21 Jan 2002, Jens Axboe wrote:

> On Mon, Jan 21 2002, Andre Hedrick wrote:
> > > I always thought it is like this (and this is what I still believe after
> > > having read the sprcification):
> > >
> > > ---
> > > SET_MUTIPLE 16 sectors
> > > ---
> > > READ_MULTIPLE 24 sectors
> > > IRQ
> > > PIO transfer 16 sectors
> > > IRQ
> > > PIO transfer 8 sectors
> > > ---
> > >
> > > Where am I wrong?
> > >
> > > By the way, the device *isn't* required to support any lower multiple
> > > count than the maximum one it advertizes. Ugly.
> >
> > No but the HOST is to obey the requirements of the device.
> > The spec is written from the drive side not the host side.
> >
> > "All Ye Hosts, SHALL address me in such a manner as described, or be
> > aborted or I SHALL remain in an undertermined state."
> >
> > Note only recently have the HOSTS been about to setup guidelines for what
> > is sane and not stupid for the device to do or behave.
> >
> > Again, the HOST(Linux) is not following the device side rules so expect
> > difficulty when we depart. The Brain Damage is how to talk to the
> > hardware, and it is clear we are not doing it right because we are bending
> > the rules stuff it into and API that not acceptable. However we are
> > stuck. Again, simplicity works, generate a MEMPOOL for PIO such that the
> > buffer pages are contigious and the 4k page dance is a NOOP. Until that
> > time we will be fussing about.
>
> Andre,
>
> Do you know how to say "I was wrong"? You are walking off-track again.
> It's clearly the way that Vojtech and I describe, otherwise current code
> would just not work. And 2.4, 2.2, 2.0 neither.

I will, and have done so in the past when I am, and it would be nice if you
and Linus could do the same. However, both of you are going to enforce
partial completion of IO on page (4k) boundaries, and you are not
allowed to pause or stop in the middle of a command execution to play
memory games under ATA/IDE PIO rules, period.

You two have limited me to a single 4k, or one page, of memory per
request. Even if the rq->buffer pointer list is handed to me, I cannot
use or walk it without having to jump in and out of a memory window.

So if you want partial completions on 4k boundaries, then do not send down
requests bigger than 4k, or give me back the rq scratch pad copy (and I do
not want it back, it is lame), or grant the mempool for the atomic IO of
the request, and it must be contiguous to allow a clean walk of the
buffer from head to tail.

<insert your reasons why it will never be granted here>

Linux's requirements go against the hardware, which is why it does not work.

I have stated: if (rq->current_nr_sectors != drive_multicount), do single
sector and restrict the data IO to rq->current_nr_sectors.
So that it is clear for everyone, "rq->current_nr_sectors" is bounded to
1, 2, 3, 4, 5, 6, 7 or 8 sectors of data IO. It is not allowed to do any
more because of the 4k rule.

Now if we want to do multi_count on a dynamic range from 1-8, because that
is the range of "rq->current_nr_sectors", fine, state it clearly and I will
adjust. Since we are in PIO we have already had screwups somewhere, so
grinding the IO down to a single page or less is okay, but you two take the
heat for the performance kill-joy.

Respectfully,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-21 22:03:54

by Andre Hedrick

Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, 21 Jan 2002, Jens Axboe wrote:

> On Mon, Jan 21 2002, Andre Hedrick wrote:
> > > I always thought it is like this (and this is what I still believe after
> > > having read the sprcification):
> > >
> > > ---
> > > SET_MUTIPLE 16 sectors
> > > ---
> > > READ_MULTIPLE 24 sectors
> > > IRQ
> > > PIO transfer 16 sectors
> > > IRQ
> > > PIO transfer 8 sectors
> > > ---
> > >
> > > Where am I wrong?
> > >
> > > By the way, the device *isn't* required to support any lower multiple
> > > count than the maximum one it advertizes. Ugly.
> >
> > No but the HOST is to obey the requirements of the device.
> > The spec is written from the drive side not the host side.
> >
> > "All Ye Hosts, SHALL address me in such a manner as described, or be
> > aborted or I SHALL remain in an undertermined state."
> >
> > Note only recently have the HOSTS been about to setup guidelines for what
> > is sane and not stupid for the device to do or behave.
> >
> > Again, the HOST(Linux) is not following the device side rules so expect
> > difficulty when we depart. The Brain Damage is how to talk to the
> > hardware, and it is clear we are not doing it right because we are bending
> > the rules stuff it into and API that not acceptable. However we are
> > stuck. Again, simplicity works, generate a MEMPOOL for PIO such that the
> > buffer pages are contigious and the 4k page dance is a NOOP. Until that
> > time we will be fussing about.
>
> Andre,
>
> Do you know how to say "I was wrong"? You are walking off-track again.
> It's clearly the way that Vojtech and I describe, otherwise current code
> would just not work. And 2.4, 2.2, 2.0 neither.

255 * 512bytes != 128K BUG
256 * 512bytes == 128K

You ensure we will fail on alignment.

You have stated BLOCK cannot deal with correct sector alignments, and
thus 255, so please fix it first. I have accepted this brokenness in BLOCK
and dropped to 128 sectors, or a clean 64k.
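
For reference, the sector arithmetic behind these figures can be checked
with a short C sketch. It assumes 512-byte sectors and a 4k page; the macro
and function names are made up for illustration and are not kernel
identifiers.

```c
#define SECTOR_SIZE  512u    /* bytes per sector (assumed) */
#define PAGE_SIZE_4K 4096u   /* bytes per page (assumed) */

/* Size in bytes of a transfer of the given number of sectors. */
unsigned int sectors_to_bytes(unsigned int sectors)
{
    return sectors * SECTOR_SIZE;
}

/* Number of whole sectors that fit in one 4k page. */
unsigned int sectors_per_page(void)
{
    return PAGE_SIZE_4K / SECTOR_SIZE;
}
```

With these definitions, 255 sectors come to 130560 bytes, 256 sectors to
131072 bytes (128K), 128 sectors to 65536 bytes (64k), and one 4k page
holds 8 sectors.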

If we restrict multi-sector PIO to 8 sectors we can do multi-interrupt
ATOMIC disk IO on the paging alignments, but you have enforced single
sector IO in the multi-sector writing and cannot see the difference.
If rq->current_nr_sectors is less than 8 we do PIO single sector IO, but
we are doing that now with the copy-and-paste changes from the old
ide-disk.c stuff that we are attempting to delete.

You are making mistakes left and right because you think you understand
the hardware. I thought we had an agreement, BLOCK stops at DO_REQUEST.
Now you are altering the driver core, and the ISR's. BLOCK has no
business in dictating how to talk to the hardware, especially since it
violates the specification willfully and without need.

We do a DMA of two PRD's of 128 sectors and 127 sectors, thus a mess.

So at this point pull it and put back the munge from before, and I will fix
it completely and return a turn-key, now that I understand the brokenness
of the interface I am dealing with on both sides.

Until you understand that the execution of the command block is ATOMIC, it
will never work. Also, when the SCSI-MID Layer is deleted, you will have a
repeat of this issue on a much grander scale. Eric was brilliant to
hide the nature of the transport layer in the SCSI-MID Layer and return
partial completion against his ATOMIC Command IO calls.

Had I been as clever as him in the past nobody would know the difference.

Respectfully,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-21 22:46:16

by Petr Vandrovec

Subject: Re: Linux 2.5.3-pre1-aia1

I did not want to participate in this discussion, as it is probably
impossible to explain to you that there is nothing wrong with doing
requests not evenly divisible by block size.

On 21 Jan 02 at 13:44, Andre Hedrick wrote:
> On Mon, 21 Jan 2002, Jens Axboe wrote:
> > On Mon, Jan 21 2002, Andre Hedrick wrote:
> > > > I always thought it is like this (and this is what I still believe after
> > > > having read the sprcification):
> > > >
> > > > ---
> > > > SET_MUTIPLE 16 sectors
> > > > ---
> > > > READ_MULTIPLE 24 sectors
> > > > IRQ
> > > > PIO transfer 16 sectors
> > > > IRQ
> > > > PIO transfer 8 sectors
> > > > ---
>
> 255 * 512bytes != 128K BUG
> 256 * 512bytes == 128K
>
> You insure we will fail on alignemnt.

The SET MULTIPLE MODE description says that the host should only try block
sizes of 1, 2, 4, 8, 16, 32, 64 or 128 sectors. So where did you get 255 from?

> You have stated BLOCK can not deal with correct sector alignments, and
> thus 255 so please fix it first. I have accepted this brokeness in BLOCK
> and dropped to 128 sectors or a clean 64k.
>
> If we restrict multi-sector PIO to 8 sectors we can do multi interrupt
> ATOMIC disk IO on the paging alignments, but you have enforced single
> sector IO in the multi-sector writing and can not see the difference.

Why can we not do multi-sector PIO with 16...128 sectors? There is no need
to read all the data with one insw() loop. You can store each byte of these
64kB of data in 65536 different, non-contiguous locations, and the ATA
device will not complain, as it will always see 32768 word reads from its
data port, nothing else... And no, there is no requirement that the host
must do back-to-back reads or writes on the ATA device data port. Otherwise
we would see an upper bound on t0 in the PIO-in and PIO-out cycle
descriptions.

> If rq->current_nr_sectors is less than 8 we do PIO single sector IO, but
> we are doing that now w/ the copy paste changes from the old ide-disk.c
> stuff that we are attempting deleting.

Please tell me what page 168 (that is the printed page number; the page
number in the PDF is greater by 14) of Volume 2, Revision 0, of ATA/ATAPI
rev. 7 (T13/1532D) says in its description of READ MULTIPLE?

If the number of requested sectors is not evenly divisible by the block
count, as many full blocks as possible are transferred, followed by a final,
partial block transfer. The partial block transfer shall be for n sectors,
where n = remainder (sector count/block count).

And almost identical text appears on page 296, where it talks about
WRITE MULTIPLE.
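
The rule quoted above can be sketched in C as follows. This is only an
illustration of the arithmetic, not driver code; next_block_sectors(),
count_blocks() and print_schedule() are hypothetical names that do not
exist in the kernel.

```c
#include <stdio.h>

/*
 * Sectors moved in the next data block of a READ/WRITE MULTIPLE command:
 * a full block while enough sectors remain, otherwise the final partial
 * block (the remainder).
 */
unsigned int next_block_sectors(unsigned int remaining,
                                unsigned int block_count)
{
    return remaining < block_count ? remaining : block_count;
}

/* Number of data blocks needed for a whole command. */
unsigned int count_blocks(unsigned int nr_sectors, unsigned int block_count)
{
    unsigned int blocks = 0;

    if (block_count == 0)       /* multiple mode not set up: nothing to do */
        return 0;

    while (nr_sectors > 0) {
        nr_sectors -= next_block_sectors(nr_sectors, block_count);
        blocks++;
    }
    return blocks;
}

/* Print the per-interrupt schedule of a read in the thread's notation. */
void print_schedule(unsigned int nr_sectors, unsigned int block_count)
{
    if (block_count == 0)
        return;

    while (nr_sectors > 0) {
        unsigned int n = next_block_sectors(nr_sectors, block_count);

        printf("IRQ\nPIO transfer %u sectors\n", n);
        nr_sectors -= n;
    }
}
```

For the example discussed in this thread, print_schedule(24, 16) prints one
transfer of 16 sectors followed by one of 8.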

If you are trying to persuade us that there are devices which support
the ATA interface and do not follow these paragraphs word for word, there
is certainly something wrong in the ATA world...

Best regards,
Petr Vandrovec
[email protected]

2002-01-21 22:58:06

by Vojtech Pavlik

Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, Jan 21, 2002 at 12:18:21PM -0800, Andre Hedrick wrote:

> > > Again, the HOST(Linux) is not following the device side rules so expect
> > > difficulty when we depart. The Brain Damage is how to talk to the
> > > hardware, and it is clear we are not doing it right because we are bending
> > > the rules stuff it into and API that not acceptable. However we are
> > > stuck. Again, simplicity works, generate a MEMPOOL for PIO such that the
> > > buffer pages are contigious and the 4k page dance is a NOOP. Until that
> > > time we will be fussing about.
> >
> > Andre,
> >
> > Do you know how to say "I was wrong"? You are walking off-track again.
> > It's clearly the way that Vojtech and I describe, otherwise current code
> > would just not work. And 2.4, 2.2, 2.0 neither.
>
> I will and have done so in the past when I am, and it would be nice if you
> and Linus could do the same. However since both are going to enforce the
> partial completion of IO on page boundaries or 4k, and you are not
> allowed to pause or stop in the middle of a command execution to play
> memory games under ATA/IDE PIO rules, period.

Maybe I'm again totally off-the-track, but I see no reason why I
couldn't stop in the middle of a PIO transfer (that is anytime, not even
on a sector boundary), do whatever I wish, like change the destination
buffer or whatever, and then continue. Sure, I can't send ANY commands
to the drive, and reading the status might not be a good idea either,
but I believe I can do anything else on the system. Is there a reason
why this shouldn't be possible?

--
Vojtech Pavlik
SuSE Labs

2002-01-21 23:34:23

by Andre Hedrick

Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, 21 Jan 2002, Petr Vandrovec wrote:

> I did not want to participate in this discussion, as it is probably
> impossible to explain to you that there is nothing wrong with doing
> requests not evenly divisible by block size.
>
> On 21 Jan 02 at 13:44, Andre Hedrick wrote:
> > On Mon, 21 Jan 2002, Jens Axboe wrote:
> > > On Mon, Jan 21 2002, Andre Hedrick wrote:
> > > > > I always thought it is like this (and this is what I still believe after
> > > > > having read the sprcification):
> > > > >
> > > > > ---
> > > > > SET_MUTIPLE 16 sectors
> > > > > ---
> > > > > READ_MULTIPLE 24 sectors
> > > > > IRQ
> > > > > PIO transfer 16 sectors
> > > > > IRQ
> > > > > PIO transfer 8 sectors
> > > > > ---
> >
> > 255 * 512bytes != 128K BUG
> > 256 * 512bytes == 128K
> >
> > You insure we will fail on alignemnt.
>
> SET MULTIPLE MODE description says that host should try block size only
> 1,2,4,8,16,32,64 or 128 sectors. So where you got 255 from?

/storage/src-2.5.3/linux-2.5.3-p2-pristine/drivers/ide

linux/drivers/ide/ide-probe.c

line 627

/* IDE can do up to 128K per request, pdc4030 needs smaller limit */
max_sectors = (is_pdc4030_chipset ? 127 : 255);
blk_queue_max_sectors(q, max_sectors);

/storage/src-2.5.3/linux-2.5.3-p2-pristine/include/linux/blkdev.h

_LINUX_BLKDEV_H

line 322

#define MAX_PHYS_SEGMENTS 128
#define MAX_HW_SEGMENTS 128
#define MAX_SECTORS 255


> > You have stated BLOCK can not deal with correct sector alignments, and
> > thus 255 so please fix it first. I have accepted this brokeness in BLOCK
> > and dropped to 128 sectors or a clean 64k.
> >
> > If we restrict multi-sector PIO to 8 sectors we can do multi interrupt
> > ATOMIC disk IO on the paging alignments, but you have enforced single
> > sector IO in the multi-sector writing and can not see the difference.
>
> Why we cannot do multi-sector PIO with 16...128 sectors? There is no need
> to read all data with one insw() loop, you can store each of these
> 64kB of data in 65536 different, non-continuous, locations, and ATA device
> will not complain, as it will always see 32768 of word reads from its
> data port, nothing else... And no, there is no requirement that host
> must do back-to-back reads or writes from ATA device data port. Otherwise
> we would see upper bound on t0 in PIO-in and PIO-out cycles description.

I did not state that we cannot do the above.
I stated it can be done using multiple interrupts and still return
partial completion of data IO, because I thought it was obvious that the
execution of the COMMAND BLOCK is ATOMIC. Specifically, once issued, it
must complete its operations. If you want to IO more than one page of
memory, then you cannot have any of it back until all is done.

No IFs, ANDs or BUTs, and this point is getting totally lost.

> > If rq->current_nr_sectors is less than 8 we do PIO single sector IO, but
> > we are doing that now w/ the copy paste changes from the old ide-disk.c
> > stuff that we are attempting deleting.
>
> Please tell me what page 168 (it is number of paper page, page number
> in PDF is by 14 greater) of Volume 2, Revision 0, of ATA/ATAPI rev.7
> (T13/1532D) in description of READ MULTIPLE talks about?
>
> If the number of requested sectors is not evenly divisible by the block
> count, as many full blocks as possible are transferred, followed by a final,
> partial block transfer. The partial block transfer shall be for n sectors,
> where n = remainder (sector count/block count).
>
> And almost identical text appears on page 296, where it talks about
> WRITE MULTIPLE.

This argument is over how to interface to the kernel and whether the kernel
will allow the device to function.

> If you are trying to persuade us that there are devices which support
> ATA interface, and do not follow these paragraphs word by word, there
> is certainly something wrong in the ATA world...

Again, this is not a device issue. It is a HOST-Kernel issue of not
providing the correct and needed glue layer to allow these operations to
happen, as you have so clearly pointed out.

Next, if rq->current_nr_sectors is less than what MULTI{READ,WRITE} is set
for, then why do we attempt it? The hardware tries to help with silly HOST
affairs, but this one is way too difficult and would cross the ATOMIC nature
of the command execution.

In DMA scatter-gather it is a reference to page locations and it is done.

In PIO there is no scatter-gather possible without a memcpy to a
contiguous buffer, period. Therefore the constraints issued by
Linus and Jens, of access to one 4k page of memory and a forced
requirement to return every 4k page of memory on completion, prevent
one from ever transacting more than 8 sectors per request in any PIO mode.

Regardless of whether the request is for 128 or 256 sectors, a maximum of 8
sectors may be written to or read from any disk, and then the request is
ended early. Next, the same request minus the first 8 sectors (max possible)
is run back up the block layer and re-issued in a new make_request.

start_request_sectors (255 sectors) max

make_request (start_request_sectors())

do_request()
ide-disk get (255 sectors)
block truncates to 8 sectors max
ide-taskfile
transfers 8 sectors max
end request (return 247 sectors)

update_request(247 to be re issued, + additional max of 8)
make_request (247 to be re issued, + additional max of 8)

Now we never have a way to decide if or when the original PIO request is
completed and it is safe to return to DMA, if we ended up in PIO because of
a DMA failure.

This is why I am going to request backing this out again: the BLOCK
API, without a MID-LAYER to buffer against the goal of the kernel,
conflicts with the hardware rule requirements. Until a satisfactory
agreement can be reached, the current direction will trash
the Virtual DMA hardware coming in the future.

Respectfully,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-22 00:09:07

by Andre Hedrick

Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, 21 Jan 2002, Vojtech Pavlik wrote:

> On Mon, Jan 21, 2002 at 12:18:21PM -0800, Andre Hedrick wrote:
>
> > > > Again, the HOST(Linux) is not following the device side rules so expect
> > > > difficulty when we depart. The Brain Damage is how to talk to the
> > > > hardware, and it is clear we are not doing it right because we are bending
> > > > the rules stuff it into and API that not acceptable. However we are
> > > > stuck. Again, simplicity works, generate a MEMPOOL for PIO such that the
> > > > buffer pages are contigious and the 4k page dance is a NOOP. Until that
> > > > time we will be fussing about.
> > >
> > > Andre,
> > >
> > > Do you know how to say "I was wrong"? You are walking off-track again.
> > > It's clearly the way that Vojtech and I describe, otherwise current code
> > > would just not work. And 2.4, 2.2, 2.0 neither.
> >
> > I will and have done so in the past when I am, and it would be nice if you
> > and Linus could do the same. However since both are going to enforce the
> > partial completion of IO on page boundaries or 4k, and you are not
> > allowed to pause or stop in the middle of a command execution to play
> > memory games under ATA/IDE PIO rules, period.
>
> Maybe I'm again totally off-the-track, but I see no reason why I
> couldn't stop in the middle of a PIO transfer (that is anytime, not even
> on a sector boundary), do whatever I wish, like change the destination
> buffer or whatever, and then continue. Sure, I can't send ANY commands
> to the drive, and reading the status might not be a good idea either,
> but I believe I can do anything else on the system. Is there a reason
> why this shouldn't be possible?

Okay, so the execution of the command block is ATOMIC, and we want to stop
an ATOMIC operation to go alter buffers? But that is not the real
question. The real question is: do we stop an ATOMIC process to return
data of a partial completion, and then return to a HALTED ATOMIC and hope
it will still go?

DEAD Method:
issue atomic write 255 sectors
write 8 sectors or 4k or 1 page of memory

interrupt_issued
exit atomic write
update top layer buffers
return;
continue write_loop;
exit on completion and update remainder.

BASTARDIZED Method:
issue write 255 sectors
truncate to max of 8 sectors
issue atomic write 8 sectors
interrupt_issued
end request and notify 4k page complete
make new request and merge and repeat.
note there is a new memcpy for the new request. (max 16 to completion)


OLD Method, with Request Page Walking:
issue atomic write 255 sectors
write sectors
interrupt_issued
walk copy of request
continue write_loop;
exit on completion and request and free local buffer.

CORRECT Method:
collect contiguous physical buffer of 255 sectors
memcpy_to_local (one memcpy)
issue atomic write 255 sectors
write sectors
interrupt_issued
update pointer
continue write_loop;
exit on completion and request and free local buffer.

The price of the overhead and the direct flakiness of the driver we are
running from is returning, so the alternative is to disable MULTI-Sector
Operations.

Respectfully,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-22 07:20:58

by Vojtech Pavlik

Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, Jan 21, 2002 at 03:53:20PM -0800, Andre Hedrick wrote:
> On Mon, 21 Jan 2002, Vojtech Pavlik wrote:
>
> > On Mon, Jan 21, 2002 at 12:18:21PM -0800, Andre Hedrick wrote:
> >
> > > > > Again, the HOST(Linux) is not following the device side rules so expect
> > > > > difficulty when we depart. The Brain Damage is how to talk to the
> > > > > hardware, and it is clear we are not doing it right because we are bending
> > > > > the rules stuff it into and API that not acceptable. However we are
> > > > > stuck. Again, simplicity works, generate a MEMPOOL for PIO such that the
> > > > > buffer pages are contigious and the 4k page dance is a NOOP. Until that
> > > > > time we will be fussing about.
> > > >
> > > > Andre,
> > > >
> > > > Do you know how to say "I was wrong"? You are walking off-track again.
> > > > It's clearly the way that Vojtech and I describe, otherwise current code
> > > > would just not work. And 2.4, 2.2, 2.0 neither.
> > >
> > > I will and have done so in the past when I am, and it would be nice if you
> > > and Linus could do the same. However since both are going to enforce the
> > > partial completion of IO on page boundaries or 4k, and you are not
> > > allowed to pause or stop in the middle of a command execution to play
> > > memory games under ATA/IDE PIO rules, period.
> >
> > Maybe I'm again totally off-the-track, but I see no reason why I
> > couldn't stop in the middle of a PIO transfer (that is anytime, not even
> > on a sector boundary), do whatever I wish, like change the destination
> > buffer or whatever, and then continue. Sure, I can't send ANY commands
> > to the drive, and reading the status might not be a good idea either,
> > but I believe I can do anything else on the system. Is there a reason
> > why this shouldn't be possible?
>
> Okay if the execution of the command block is ATOMIC, and we want to stop
> an ATOMIC operation to go alter buffers?

YES! I think you got it! Because atomic here doesn't mean 'do it all as
soon as possible with no delay', but 'do nothing else on the ATA bus
in between'.

> But that is not the real question. The real question is do we stop
> and ATOMIC process to return data of a partial completeion, and then
> return to a HALTED ATOMIC and hope it will still go?

Yes, and we can do this, and there is no reason why this should not
work.

> DEAD Method:
> issue atomic write 255 sectors
> write 8 sector or 4k or 1 page of memory
>
> interrupt_issued
> exit atomic write
> update top layer buffers
> return;
> continue write_loop;
> exit on completion and update remainder.
>
> BASTARDIZED Method:
> issue write 255 sectors
> truncate to max of 8 sectors
> issue atomic write 8 sectors
> interrupt_issued
> end request and notify 4k page complete
> make new request and merge and repeat.
> note there is a new memcpy fo new request. (max 16 to completion)
>
>
> OLD Method, with Request Page Walking:
> issue atomic write 255 sectors
> write sectors
> interrupt_issued
> walk copy of request
> continue write_loop;
> exit on completion and request and free local buffer.
>
> CORRECT Method:
> collect contigious physical buffer of 255 sectors
> memcpy_to_local (one memcpy)
> issue atomic write 255 sectors
> write sectors
> interrupt_issued
> update pointer
> continue write_loop;
> exit on completion and request and free local buffer.
>
> The price of the overhead and the direct flakyness of the driver we are
> running from is returning, so the alternative is to disable MULTI-Sector
> Operations.

That's pretty much nonsense, I beg your pardon. The real correct way would
be:

issue read of 255 sectors using READ_MULTI, max_mult = 16
receive interrupt
inw() first 4k to buffer A
inw() second 4k to buffer B
don't do anything else until the next interrupt

There is absolutely no need for an intermediate scratch buffer, you can
put the inw()ed data anywhere you like, and if you need any post
processing, you can do it as well, at any time.
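
The point about switching buffers in the middle of a data block can be
sketched in C. This is a user-space toy, not driver code: fake_port and
fake_inw() stand in for the drive and for inw() on the data register, and
pio_read_block() is a hypothetical helper. It assumes 512-byte sectors, 4k
pages and 16-bit reads, and that the caller supplies enough pages.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTOR_WORDS 256u    /* 512-byte sector = 256 16-bit words */
#define PAGE_WORDS   2048u   /* 4k page = 2048 16-bit words */

/* Stand-in for the drive: counts word reads, returns a running pattern. */
struct fake_port {
    uint32_t words_read;
};

/* Stand-in for inw() on the ATA data register. */
uint16_t fake_inw(struct fake_port *port)
{
    return (uint16_t)(port->words_read++ & 0xffffu);
}

/*
 * Service one interrupt: read nr_sectors from the port into separately
 * allocated pages, continuing at *page / *offset. The destination changes
 * in the middle of the data block whenever a page fills up; the device
 * only ever sees consecutive word reads from its data port.
 */
void pio_read_block(struct fake_port *port, uint16_t **pages,
                    size_t *page, size_t *offset, unsigned int nr_sectors)
{
    uint32_t words = nr_sectors * SECTOR_WORDS;

    while (words--) {
        pages[*page][(*offset)++] = fake_inw(port);
        if (*offset == PAGE_WORDS) {   /* page full: move to the next one */
            (*page)++;
            *offset = 0;
        }
    }
}
```

With a block of 16 sectors, one call fills exactly two pages: the first 4k
lands in buffer A and the second 4k in buffer B, with no intermediate
scratch buffer.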

--
Vojtech Pavlik
SuSE Labs

2002-01-22 07:41:09

by Jens Axboe

Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, Jan 21 2002, Andre Hedrick wrote:
> 255 * 512bytes != 128K BUG
> 256 * 512bytes == 128K
>
> You insure we will fail on alignemnt.
>
> You have stated BLOCK can not deal with correct sector alignments, and
> thus 255 so please fix it first. I have accepted this brokeness in BLOCK
> and dropped to 128 sectors or a clean 64k.

What statement? The block layer gives you the alignment that _you_
need. Heck, all you have to do is make a simple API call and define your
rules. Nobody is trying to cram anything down your throat. 255 is just
the kernel default, nothing more.

> If we restrict multi-sector PIO to 8 sectors we can do multi interrupt
> ATOMIC disk IO on the paging alignments, but you have enforced single
> sector IO in the multi-sector writing and can not see the difference.

False, I've enforced current_nr_sectors transfers in multi-sector write.
But Andre please understand that changes like this are never final, feel
free to alter it and make it work any other way. Fact is, we needed a
change that made pio and multi mode _work_ right now -- my change does
that, and it's not all that bad.

> If rq->current_nr_sectors is less than 8 we do PIO single sector IO, but
> we are doing that now w/ the copy paste changes from the old ide-disk.c
> stuff that we are attempting deleting.

Well I'm sorry for making changes to your isr's, but they were broken.

> You are making mistakes left and right because you think you understand
> the hardware. I thought we had an agreement, BLOCK stops at DO_REQUEST.
> Now you are altering the driver core, and the ISR's. BLOCK has no
> business in dictating how to talk to the hardware, especially since it
> violates the specification willfully and without need.

?! I can't do anything but block stuff now, what a pity. Please tell me
where the block layer is dictating what you must do.

> We do a DMA of two PRD's of 128 sectors and 127 sectors, thus a mess.

I can add a

BUG_ON(rq->nr_sectors == 255);

for you if you want, that will not happen.

> So at this point pull it and put back the munge for before and I will fix
> it completely and return a turn-key, now that I understand the brokeness
> of the interface I am deal w/ on both sides.

What is broken?

> Until you understand the execution of the command block is ATOMIC it will
> never work. Also when the SCSI-MID Layer is deleted, you will have a
> repeat of this issue on a much grander scale. Eric was a brilliant to
> hide the nature of the transport layer in the SCSI-MID Layer and return
> back partial completion against his ATOMIC Command IO calls.

Oh like the partial completion I added in ide_end_request?

Please calm down Andre, let's not start another round of flames.

--
Jens Axboe

2002-01-22 07:59:22

by Jens Axboe

Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, Jan 21 2002, Andre Hedrick wrote:
> In PIO there is no scatter gather possible without a memcpy to a
> contigious buffer period. Therefore under the contstraints issued bu

Why?

> Linus and Jens, of access to one 4k page of memory, and a forced
> requirement to return back every 4k page of memory of completion prevents
> one from ever transaction more than 8 sectors per request in PIO any mode.

You don't understand... It's not forced, it's just _the sane way to do
it_. When you finish I/O on a chunk of data, end I/O on that chunk of
data. This definitely does _not_ prevent transacting more than 8
sectors per request, that's nonsense. It's only that way in the current
kernel because it was easy to get right the first time around. And it's
only in multi-write; oh look at multi-read, that does 16 sectors at a
time. Weee!

> start_request_sectors (255 sectors) max
>
> make_request (start_request_sectors())
>
> do_request()
> ide-disk get (255 sectors)
> block truncates to 8 sectors max
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Wrong

> ide-taskfile
> transfers 8 sectors max
^^^^^^^^^^^^^^^^^^^^^^^

Wrong

> end request (return 247 sectors)
>
> upate_request(247 to be re issued, + additional max of 8)

end_request _is_ the update request. You seem to not understand that
calling ide_end_request does not mean that we are terminating the
request from the host side, we are merely asking the block layer to
complete xxx amount of sectors for us so we can continue doing the
request residual.
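
The idea of ending part of a request and carrying on with the residual can
be modelled with a toy in C. Everything here is hypothetical: toy_request,
toy_end_request() and toy_run_request() are made-up names, and the sketch
does not reproduce the signatures or internals of ide_end_request or
end_that_request_first. It only assumes a convention where the completion
call reports whether sectors are still outstanding.

```c
/* Toy request: only the bookkeeping needed for this illustration. */
struct toy_request {
    unsigned int nr_sectors;   /* sectors not yet completed */
    unsigned int completed;    /* sectors already handed back */
};

/*
 * Complete up to nr sectors of the request. Returns 1 while a residual
 * remains (the driver keeps working on the same request) and 0 once the
 * whole request is done.
 */
int toy_end_request(struct toy_request *rq, unsigned int nr)
{
    if (nr > rq->nr_sectors)
        nr = rq->nr_sectors;
    rq->nr_sectors -= nr;
    rq->completed += nr;
    return rq->nr_sectors != 0;
}

/*
 * Drive one request to completion in chunks of at most chunk sectors.
 * Returns the number of partial completions; no new request is created.
 */
unsigned int toy_run_request(struct toy_request *rq, unsigned int chunk)
{
    unsigned int calls = 0;
    int more = rq->nr_sectors != 0 && chunk != 0;

    while (more) {
        more = toy_end_request(rq, chunk);
        calls++;
    }
    return calls;
}
```

In this model a 255-sector request completed 8 sectors at a time takes 32
partial completions (31 of 8 sectors and a final one of 7), all against the
same request.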

> make_request (247 to be re issued, + additional max of 8)

Very wrong, make_request is never called here. Let me outline what
happens. Do you really mean start_request? If so, then yes.

> This is why I am going to request for backing out again because the BLOCK
> API without a MID-LAYER to buffer against the goal of the kernel,
> conflicts with the hardware rules requirements. Until a satisfactory

end_that_request_first understands partial completion of any size in
2.5, what more of a mid layer do you want?

> agreement can be reached then the current direction it is going will trash
> the Virtual DMA hardware coming in the future.

Is that so?

--
Jens Axboe

2002-01-22 08:12:34

by Andre Hedrick

Subject: Re: Linux 2.5.3-pre1-aia1

On Tue, 22 Jan 2002, Vojtech Pavlik wrote:

> On Mon, Jan 21, 2002 at 03:53:20PM -0800, Andre Hedrick wrote:
> > On Mon, 21 Jan 2002, Vojtech Pavlik wrote:
> > Okay if the execution of the command block is ATOMIC, and we want to stop
> > an ATOMIC operation to go alter buffers?
>
> YES! I think you got it! Because atomic here doesn't mean 'do it all as
> soon as possible with no delay', but 'do nothing else on the ATA bus
> inbetween'.

In order to do this you can not issue a sector request larger than an
addressable buffer, since the request walking of the rq->buffer is not
allowed.

> > But that is not the real question. The real question is do we stop
> > and ATOMIC process to return data of a partial completeion, and then
> > return to a HALTED ATOMIC and hope it will still go?
>
> Yes, and we can do this, and there is no reason why this should not
> work.

PONDERING ....

> > DEAD Method:
> > issue atomic write 255 sectors
> > write 8 sector or 4k or 1 page of memory
> >
> > interrupt_issued
> > exit atomic write
> > update top layer buffers
> > return;
> > continue write_loop;
> > exit on completion and update remainder.
> >
> > BASTARDIZED Method:
> > issue write 255 sectors
> > truncate to max of 8 sectors
> > issue atomic write 8 sectors
> > interrupt_issued
> > end request and notify 4k page complete
> > make new request and merge and repeat.
> > note there is a new memcpy fo new request. (max 16 to completion)
> >
> >
> > OLD Method, with Request Page Walking:
> > issue atomic write 255 sectors
> > write sectors
> > interrupt_issued
> > walk copy of request
> > continue write_loop;
> > exit on completion and request and free local buffer.
> >
> > CORRECT Method:
> > collect contigious physical buffer of 255 sectors
> > memcpy_to_local (one memcpy)
> > issue atomic write 255 sectors
> > write sectors
> > interrupt_issued
> > update pointer
> > continue write_loop;
> > exit on completion and request and free local buffer.
> >
> > The price of the overhead and the direct flakyness of the driver we are
> > running from is returning, so the alternative is to disable MULTI-Sector
> > Operations.
>
> That's pretty much nonsense, beg my pardon. The real correct way would
> be:

The error we agreed upon is the "memcpy_to_local" and "free local buffer".

If a low-memory contiguous physical buffer were allocated and all the
bounce_memory locations were copied there before submission to the lower
levels, the drive could run down the buffer and be done, preserving the
entire request for error handling when a write to media fails.

> issue read of 255 sectors using READ_MULTI, max_mult = 16
> receive interrupt
> inw() first 4k to buffer A
> inw() second 4k to buffer B
> don't do anything else until the next interrupt

The second buffer has been taken away, so this is not possible.

> There is absolutely no need for an intermediate scratch buffer, you can
> put the inw()ed data anywhere you like, and if you need any post
> processing, you can do it as well, at any time.

Explain where buffer B comes from, because I am totally lost on the above.

I am totally worn out and do not care about the issue anymore for now.

Regards

--a

2002-01-22 08:17:25

by Jens Axboe

Subject: Re: Linux 2.5.3-pre1-aia1

On Mon, Jan 21 2002, Andre Hedrick wrote:
> > On Mon, Jan 21, 2002 at 03:53:20PM -0800, Andre Hedrick wrote:
> > > On Mon, 21 Jan 2002, Vojtech Pavlik wrote:
> > > Okay if the execution of the command block is ATOMIC, and we want to stop
> > > an ATOMIC operation to go alter buffers?
> >
> > YES! I think you got it! Because atomic here doesn't mean 'do it all as
> > soon as possible with no delay', but 'do nothing else on the ATA bus
> > inbetween'.
>
> In order to do this you can not issue a sector request larger than an
> addressable buffer, since the request walking of the rq->buffer is not
> allowed.

It's not that it's not allowed, it's that it doesn't work the way you
want it. ->buffer is just the first segment, which is 8 sectors max,
that much is correct. But nothing prevents you from ending the front
of the request and continuing and the drive will never know. Just see
task_mulin_intr.

--
Jens Axboe

2002-01-22 09:12:54

by Andre Hedrick

Subject: Re: Linux 2.5.3-pre1-aia1


Last volley for a while ...

On Tue, 22 Jan 2002, Jens Axboe wrote:

> On Mon, Jan 21 2002, Andre Hedrick wrote:
> > In PIO there is no scatter gather possible without a memcpy to a
> > contigious buffer period. Therefore under the contstraints issued bu
>
> Why?

Okay, provide one, and a way to reference from the start point and
encapsulate around segments.

> > Linus and Jens, of access to one 4k page of memory, and a forced
> > requirement to return back every 4k page of memory of completion prevents
> > one from ever transaction more than 8 sectors per request in PIO any mode.
>
> You don't understand... It's not forced, it's just _the sane way to do
> it_. When you finish I/O on a chunk of data, end I/O on that chunk of
> data. This doesn definitely _not_ prevent transaction of more than 8
> sectors per request, that's nonsense. It's only that way in the current
> kernel because it was easy to get right the first time around. And it's
> only in multi-write, oh look at multi-read, that does 16 sectors at the
> time. Weee!

We really have a disconnect.

Every time the command register is written to, the drive sees a new request.

You are calling this a segment ??

> > start_request_sectors (255 sectors) max
> >
> > make_request (start_request_sectors())
> >
> > do_request()
> > ide-disk get (255 sectors)
> > block truncates to 8 sectors max
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Wrong

Exclude DMAing.

When the sector_count register is set to a value smaller than the total
request entering, is that not truncating the IO?

> > ide-taskfile
> > transfers 8 sectors max
> ^^^^^^^^^^^^^^^^^^^^^^^
>
> Wrong

How? The constraints are on 4k boundaries or a single page.
4k == 8-512byte sectors.

> > end request (return 247 sectors)
> >
> > upate_request(247 to be re issued, + additional max of 8)
>
> end_request _is_ the update request. You seem to not understand that
> calling ide_end_request does not mean that we are terminating the
> request from the host side, we are merely asking the block layer to
> complete xxx amount of sectors for us so we can continue doing the
> request residual.

Okay, I see there is a lot more code in block to decode.
Because the assumption of the ISR's submitted is for exclusive ownership
of the request and buffer from head to tail for the duration of the
total request. This assumption is similar to DMA executing and total
ownership until completing. This may be a folly wrt interfacing to the
block layer.

> > make_request (247 to be re issued, + additional max of 8)
>
> Very wrong, make_request is never called here. Let me out line what
> happens. Do you really mean start_request, if so then yes.

Wait, if the original request requires multiple executions of the command
block then they are separate requests.

> > This is why I am going to request for backing out again because the BLOCK
> > API without a MID-LAYER to buffer against the goal of the kernel,
> > conflicts with the hardware rules requirements. Until a satisfactory
>
> end_that_request_first understands partial completion of any size in
> 2.5, what more of a mid layer do you want?

One which interfaces to the hardware exactly, but it appears not possible.
One having the size of the total request and exclusive ownership be
mappable for the duration of the IO.

There are issues associated w/ the barrier down block and how to notify.
Thus owning the entire buffer exclusively for a flush cache barrier write is
needed and may be required. Unless there is a way to notify based on the
LBA physical sector or "block" of the beginning of the error reported,
problems will arise in journaling filesystems. The rationale for
maintaining ownership of an (attempted) ordered write and down block: the
hardware will not guarantee the data beyond the error returned from the
flush-cache command.

> > agreement can be reached then the current direction it is going will trash
> > the Virtual DMA hardware coming in the future.
>
> Is that so?

Yes but we can cross that path when I release the MMIO driver(s) and core.

So I will wander off and think some more, because I see your reference
frame now, but I wonder if you see mine.

Regards,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-22 09:52:37

by Andre Hedrick

Subject: Re: Linux 2.5.3-pre1-aia1


A CLUE HAS ARRIVED ...

On Tue, 22 Jan 2002, Jens Axboe wrote:

> On Mon, Jan 21 2002, Andre Hedrick wrote:
> > > On Mon, Jan 21, 2002 at 03:53:20PM -0800, Andre Hedrick wrote:
> > > > On Mon, 21 Jan 2002, Vojtech Pavlik wrote:
> > > > Okay if the execution of the command block is ATOMIC, and we want to stop
> > > > an ATOMIC operation to go alter buffers?
> > >
> > > YES! I think you got it! Because atomic here doesn't mean 'do it all as
> > > soon as possible with no delay', but 'do nothing else on the ATA bus
> > > inbetween'.
> >
> > In order to do this you can not issue a sector request larger than an
> > addressable buffer, since the request walking of the rq->buffer is not
> > allowed.
>
> It's not that it's not allowed, it's that it doesn't work the way you
> want it. ->buffer is just the first segment, which is 8 sectors max,
> that much is correct. But nothing prevents your from ending the front
> of the request and continuing and the drive will never know. Just see
> task_mulin_intr.

Is this not the effect of stopping the actual IO?
Then you have to issue another ACB to restart the IO for the next segment?
The device has to know when to stop sending.

ERM, the pain is sinking in .....

It may be possible to meet this paging requirement if, on a READ (any PIO),
we reset or update the rq->buffer prior to reading from the data register.
Now what guarantee will the driver have of the buffer being a full 8
sectors before the first read? And if that is not enough for the complete
segment transaction, then reducing the expected transfer size between
interrupts will allow for larger values to be put into the
sector_count register. This reduction must correspond to the expected and
required 4k page.

This I can do, and we can move forward.

If the update of the rq->buffer occurs afterwards, we may face a
driver--device race w/ an early and missed interrupt asserted.

This sounds like what "Davide Libenzi" is reporting.
Not really a lost interrupt, but one that arrived while the rq->buffer is
being updated. Thus the ordering of events is wrong.

forced to set max_multi_sector to page size.

cmd->(buffer must be attached)->isr(statDRQ|BSY,read2buffer)->reload_isr()
isr_n(get_new_buffer,statDRQ|BSY,read2buffer)->reload_isr()

if we relaunch because of statDRQ|BSY is not correct, we need to know the
new buffer is loaded.

writing becomes more interesting.

I have to overlay Linux onto the state diagrams and then redraft.

Cheers,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development


2002-01-22 10:07:07

by Jens Axboe

Subject: Re: Linux 2.5.3-pre1-aia1

On Tue, Jan 22 2002, Andre Hedrick wrote:
>
> A CLUE HAS ARRIVED ...
>
> On Tue, 22 Jan 2002, Jens Axboe wrote:
>
> > On Mon, Jan 21 2002, Andre Hedrick wrote:
> > > > On Mon, Jan 21, 2002 at 03:53:20PM -0800, Andre Hedrick wrote:
> > > > > On Mon, 21 Jan 2002, Vojtech Pavlik wrote:
> > > > > Okay if the execution of the command block is ATOMIC, and we want to stop
> > > > > an ATOMIC operation to go alter buffers?
> > > >
> > > > YES! I think you got it! Because atomic here doesn't mean 'do it all as
> > > > soon as possible with no delay', but 'do nothing else on the ATA bus
> > > > inbetween'.
> > >
> > > In order to do this you can not issue a sector request larger than an
> > > addressable buffer, since the request walking of the rq->buffer is not
> > > allowed.
> >
> > It's not that it's not allowed, it's that it doesn't work the way you
> > want it. ->buffer is just the first segment, which is 8 sectors max,
> > that much is correct. But nothing prevents your from ending the front
> > of the request and continuing and the drive will never know. Just see
> > task_mulin_intr.
>
> Is this not the effect of stopping the actual IO?

No, not at all. It goes something like this (for multi read, the case
discussed here). Settings for this sample-run are:

- multi mode set to 16 sectors
- request: nr_sectors 24 sectors, current_nr_sectors 8. request is thus
split in 3 parts, we need to partially complete it to finish it.

o ide_do_request, get new active request
o start_request, hand off to ide-disk:do_rw_disk()
o do_rw_disk: setup taskfile, arm interrupt handler, return

[interrupt triggers]

o status is good, we can transfer the 16 sectors the drive expects

o taskfile_input_data for 8 sectors:

	nsect = rq->current_nr_sectors;
	if (nsect > msect)
		nsect = msect;

o call ide_end_request to indicate completion of these 8 sectors.
  o calls end_that_request_first to complete the first buffer head
    in the request, resetup request for next transfer.

o ide_end_request returns 1, request is not complete.

o taskfile_input_data for 8 sectors.

o call ide_end_request again, still returns 1 (now we have 8 sectors
left in the request)

o now we have transferred the 16 sectors inside the interrupt handler,
since request is not complete rearm interrupt handler and return.

Next time task_mulin_intr is fired, we do the remaining 8 sectors. This
time the drive knows to expect only 8 sectors, since we originally
programmed it for 24 sectors total for this request.
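
The sample run above can be modelled in user space. What follows is only a
minimal sketch, not the driver's code: struct fake_rq and the fake_* helpers
are hypothetical names that mimic the nr_sectors / current_nr_sectors
bookkeeping described in the walk-through, assuming 8-sector (4k) segments
and a drive that presents min(remaining, multi count) sectors per interrupt.

```c
#include <assert.h>

#define SEG_SECTORS 8   /* assumption: one 4k page = 8 sectors of 512 bytes */

/* Hypothetical stand-in for struct request; only the two counters
 * discussed above are modelled. */
struct fake_rq {
	unsigned int nr_sectors;         /* sectors left in the whole request */
	unsigned int current_nr_sectors; /* sectors left in the front segment */
};

static unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Models partial completion: retire nsect sectors from the front of the
 * request and set up the next segment. Returns 1 while sectors remain,
 * 0 once the request is finished (same convention as the walk-through). */
static int fake_end_request(struct fake_rq *rq, unsigned int nsect)
{
	assert(nsect <= rq->current_nr_sectors);
	rq->nr_sectors -= nsect;
	rq->current_nr_sectors -= nsect;
	if (rq->current_nr_sectors == 0)
		rq->current_nr_sectors = min_u(rq->nr_sectors, SEG_SECTORS);
	return rq->nr_sectors != 0;
}

/* Models one firing of the multi-read interrupt handler: the drive
 * presents min(remaining, msect) sectors and the handler drains them one
 * segment at a time. Returns the number of sectors drained. */
static unsigned int fake_mulin_intr(struct fake_rq *rq, unsigned int msect)
{
	unsigned int block = min_u(rq->nr_sectors, msect);
	unsigned int done = 0;

	while (done < block) {
		unsigned int nsect = min_u(rq->current_nr_sectors, block - done);

		/* a real driver would read nsect sectors from the data
		 * register into the current segment here */
		done += nsect;
		if (!fake_end_request(rq, nsect))
			break;
	}
	return done;
}

/* Runs a whole request and returns how many interrupts it took. */
static unsigned int fake_run_request(unsigned int total, unsigned int msect)
{
	struct fake_rq rq;
	unsigned int interrupts = 0;

	rq.nr_sectors = total;
	rq.current_nr_sectors = min_u(total, SEG_SECTORS);
	while (rq.nr_sectors) {
		fake_mulin_intr(&rq, msect);
		interrupts++;
	}
	return interrupts;
}
```

Under these assumptions, fake_run_request(24, 16) needs two interrupts: the
first drains 16 sectors as two 8-sector segments, the second drains the last
8. No new command is issued in between; only the counters move.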

> Then you have to issue another ACB to restart the IO for the next segment?
> The device has to know when to stop sending.

Nope, see the above.

> It may be possible to do this is paging requirement if on a READ(any pio),
> reset or update the rq->buffer prior to reading from the data register.

Yes that's very important, the ordering must be right or we are screwed.

> Now what guarentee will the driver have if a the buffer being a full 8
> sectors before the first read, and if that is not enough for the complete
> segment transaction, then if we reduce the expected transfers size between
> interrupts, it will allow for larger values to be put into the
> sector_count register. This reduction must correspond to the expected and
> required 4k page.

But why? The above scenario works.

> This I can do, and we can move forward.
>
> If the update of the rq->buffer occurrs afterwards, we may face a
> driver--device race w/ an early and missied interrupt asserted.

We don't care about rq->buffer at all. What is important is correct (and
ordered) rq->current_nr_sectors updates so that ide_map_rq returns the
right transfer location.

> This sounds like what "Davide Libenzi" is reporting.
> Not really a losted, but arrived while the rq->buffer is being updated.
> Thus ordering of events are wrong.

It very well could be.

--
Jens Axboe

2002-01-22 10:22:07

by Denis Vlasenko

Subject: Re: Linux 2.5.3-pre1-aia1

Whee, an IDE flamewar! :-)

People, can we get colder? Let's clarify positions without generating useless
heat, ok?


1. Re multi-sector reads/writes:

On 21 January 2002 20:45, Petr Vandrovec wrote:
> If the number of requested sectors is not evenly divisible by the block
> count, as many full blocks as possible are transferred, followed by a
> final, partial block transfer. The partial block transfer shall be for n
> sectors, where n = remainder (sector count/block count).
>
> And almost identical text appears on page 296, where it talks about
> WRITE MULTIPLE.
>
> If you are trying to persuade us that there are devices which support
> ATA interface, and do not follow these paragraphs word by word, there
> is certainly something wrong in the ATA world...

Seems logical to me too. Imagine we have told drive to use 16 sector multi
mode. Now we are trying to read 24 sectors (3 pages of 4k each):

CPU                             IDE
-------------                   ---------------
read_multiple(sect cnt=24) ->
                                *reading 16 sectors*
                                <- interrupt
give me data ->
(asm: 'rep insw')
                                <- <- <-16bit words with data
                                <- <- <- (total: 16 sectors)
                                *reading 8 sectors*
                                <- interrupt
give me data ->
                                <- <- <-16bit words with data
                                <- <- <- (total: 8 sectors)

Andre, do you think that it is _not_ ok to do multi-sector read/write ops
with sector count non-divisible by programmed multisector count?
Do you have or know of some existing drive which misbehaves? Do you think
such drive will appear in future?
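
The arithmetic in the quoted paragraph is small enough to write down. This
is a sketch only: multi_block_sectors() and multi_block_total() are
hypothetical helper names, and sector_count is assumed to be the real number
of sectors requested (not the raw register encoding).

```c
#include <assert.h>

/* Sectors transferred in data block `index` (0-based) of a READ/WRITE
 * MULTIPLE command: full blocks of block_count sectors first, then one
 * partial block of (sector_count % block_count) sectors if there is a
 * remainder. Returns 0 once index is past the last block. */
static unsigned int multi_block_sectors(unsigned int sector_count,
					unsigned int block_count,
					unsigned int index)
{
	unsigned int full, rem;

	assert(block_count != 0);
	full = sector_count / block_count;
	rem = sector_count % block_count;
	if (index < full)
		return block_count;
	if (index == full)
		return rem;
	return 0;
}

/* Number of data blocks (and so interrupts, on a read) for the command. */
static unsigned int multi_block_total(unsigned int sector_count,
				      unsigned int block_count)
{
	assert(block_count != 0);
	return (sector_count + block_count - 1) / block_count;
}
```

For the example in the diagram, multi_block_sectors(24, 16, 0) is 16 and
multi_block_sectors(24, 16, 1) is 8, so the command completes in
multi_block_total(24, 16) = 2 blocks.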


2. Re contiguous buffer for large PIO blocks:

On 21 January 2002 21:53, Andre Hedrick wrote:
> OLD Method, with Request Page Walking:
> issue atomic write 255 sectors
> write sectors
> interrupt_issued
> walk copy of request
> continue write_loop;
> exit on completion and request and free local buffer.
>
> CORRECT Method:
> collect contigious physical buffer of 255 sectors
> memcpy_to_local (one memcpy)
> issue atomic write 255 sectors
> write sectors
> interrupt_issued
> update pointer
> continue write_loop;
> exit on completion and request and free local buffer.

Do I understand OLD method correctly? Example: reading 128 sectors
in one transfer (assuming drive can do 128 multisector PIO):

	void* page[16]; /* holds addresses of target 4k pages */
	...
	/* in interrupt handler: get data from IDE in PIO mode */
	i=0;
	while(i<16) {
		rep_insw(4096/2, page[i]);
		/* rep_insw() in i386 pseudo-asm:
		   dx=ioport; ecx=4096/2; edi=page[i]; cld; rep insw
		*/
		i++;
	}
	...

I don't see flaws here, IDE will never notice that buffer is non-contiguous
(except for tiny delay between insw's while i++ and i<16 get executed).
Andre, can you explain what's wrong here and why you think we need CORRECT
method?
--
vda

2002-01-22 10:27:17

by Anton Altaparmakov

Subject: Re: Linux 2.5.3-pre1-aia1

Hi Jens,

A quick question which is I think what Andre is really concerned about when
talking about atomicity... or if he is not then I would be. (-;

At 10:06 22/01/02, Jens Axboe wrote:
>On Tue, Jan 22 2002, Andre Hedrick wrote:
> >
> > A CLUE HAS ARRIVED ...
> >
> > On Tue, 22 Jan 2002, Jens Axboe wrote:
> >
> > > On Mon, Jan 21 2002, Andre Hedrick wrote:
> > > > > On Mon, Jan 21, 2002 at 03:53:20PM -0800, Andre Hedrick wrote:
> > > > > > On Mon, 21 Jan 2002, Vojtech Pavlik wrote:
> > > > > > Okay if the execution of the command block is ATOMIC, and we
> want to stop
> > > > > > an ATOMIC operation to go alter buffers?
> > > > >
> > > > > YES! I think you got it! Because atomic here doesn't mean 'do it
> all as
> > > > > soon as possible with no delay', but 'do nothing else on the ATA bus
> > > > > inbetween'.
> > > >
> > > > In order to do this you can not issue a sector request larger than an
> > > > addressable buffer, since the request walking of the rq->buffer is not
> > > > allowed.
> > >
> > > It's not that it's not allowed, it's that it doesn't work the way you
> > > want it. ->buffer is just the first segment, which is 8 sectors max,
> > > that much is correct. But nothing prevents your from ending the front
> > > of the request and continuing and the drive will never know. Just see
> > > task_mulin_intr.
> >
> > Is this not the effect of stopping the actual IO?
>
>No, not at all. It goes something like this (for multi read, the case
>discussed here). Settings for this sample-run are:
>
>- multi mode set to 16 sectors
>- request: nr_sectors 24 sectors, current_nr_sectors 8. request is thus
> split in 3 parts, we need to partially complete it do finish it.
>
>o ide_do_request, get new active request
>o start_request, hand off to ide-disk:do_rw_disk()
>o do_rw_disk: setup taskfile, arm interrupt handler, return
>
>[interrupt triggers]
>
>o status is good, we can transfer the 16 sectors the drive expects

Is it possible that the request is aborted at any point between here...

>o taskfile_input_data for 8 sectors:
>
> nsect = rq->current_nr_sectors;
> if (nsect > msect)
> nsect = msect;

>o call ide_end_request to indicate completion of these 8 sectors.
> o calls end_that_request_last to complete the first buffer head
> in the request, resetup request for next transfer.
>
>o ide_end_request returns 1, request is not complete.

... and ide_end_request returning 1 here so that ide_end_request would in
fact not return 1?

If not, then there is no problem. The operation is atomic, it's just a
switch from one destination page to another (taking this particular example
and 4k page size), whether the switch happens fast enough is a different
cattle of fish altogether...

If yes, I see where Andre is complaining: an abort at this position would
leave the drive in "io in flight" state and you get "lost interrupt" and
possibly it all goes pear-shaped the first time the next command goes to
the drive (unless it happens to be the appropriate reset command).

I hope that makes any sense?

Best regards,

Anton

>o taskfile_input_data for 8 sectors.
>
>o call ide_end_request again, still returns 1 (now we have 8 sectors
> left in the request)
>
>o now we have transferred the 16 sectors inside the interrupt handler,
> since request is not complete rearm interrupt handler and return.
>
>Next time task_mulin_intr is fired, we do the remaining 8 sectors. This
>time the drive knows to expect only 8 sectors, since we originally
>programmed it for 24 sectors total for this request.
>
> > Then you have to issue another ACB to restart the IO for the next segment?
> > The device has to know when to stop sending.
>
>Nope, see the above.
>
> > It may be possible to do this is paging requirement if on a READ(any pio),
> > reset or update the rq->buffer prior to reading from the data register.
>
>Yes that's very important, the ordering must be right or we are screwed.
>
> > Now what guarentee will the driver have if a the buffer being a full 8
> > sectors before the first read, and if that is not enough for the complete
> > segment transaction, then if we reduce the expected transfers size between
> > interrupts, it will allow for larger values to be put into the
> > sector_count register. This reduction must correspond to the expected and
> > required 4k page.
>
>But why? The above scenario works.
>
> > This I can do, and we can move forward.
> >
> > If the update of the rq->buffer occurrs afterwards, we may face a
> > driver--device race w/ an early and missied interrupt asserted.
>
>We don't care about rq->buffer at all. What is important is correct (and
>ordered) rq->current_nr_sectors updates so that ide_map_rq returns the
>right transfer location.
>
> > This sounds like what "Davide Libenzi" is reporting.
> > Not really a losted, but arrived while the rq->buffer is being updated.
> > Thus ordering of events are wrong.
>
>It very well could be.
>
>--
>Jens Axboe

--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/

2002-01-22 16:51:10

by Linus Torvalds

Subject: Re: Linux 2.5.3-pre1-aia1


On Tue, 22 Jan 2002, Vojtech Pavlik wrote:
>
> That's pretty much nonsense, beg my pardon. The real correct way would
> be:
>
> issue read of 255 sectors using READ_MULTI, max_mult = 16
> receive interrupt
> inw() first 4k to buffer A
> inw() second 4k to buffer B
> don't do anything else until the next interrupt

Definitely.

There is no way the controller can even _know_ the difference between

- one large 8kB "rep insw" instruction
- two (or more) smaller chunks of "rep insw" adding up to 8kB worth of
"inw"

as long as there are no other IO instructions to that controller in
between. The two look _exactly_ the same on the bus - there aren't even
any bursting issues (you can only burst on MMIO, not PIO accesses).

Sure, there are some timing issues, but (a) data cache misses are much
bigger things than just a few instructions, and (b) we allow interrupts
from other devices anyway, so the timing _really_ isn't even an issue.

So just call "ata_input_data()" several times in a loop for discontinuous
buffers. I told Andre this before.
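
The claim is easy to check against a toy model. The sketch below is an
illustration under stated assumptions, not kernel code: struct fake_port
stands in for the ATA data register (each read hands back the next 16-bit
word), fake_insw() stands in for "rep insw", and all names are hypothetical.
It reads 8kB once in a single call and once as two 4kB chunks into separate,
out-of-order buffers, then compares the results.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_WORDS (4096 / 2)   /* 16-bit words in one 4k page */

/* Hypothetical model of the data register: every read returns the next
 * word of the drive's sector buffer. */
struct fake_port {
	uint16_t next;
};

/* Stand-in for "rep insw": read `words` 16-bit words from the port. */
static void fake_insw(struct fake_port *port, uint16_t *dst, size_t words)
{
	size_t i;

	assert(dst != NULL);
	for (i = 0; i < words; i++)
		dst[i] = port->next++;
}

/* Drain nseg segments of words_per_seg words each, one call per
 * (possibly discontiguous) destination buffer. */
static void fake_input_segments(struct fake_port *port, uint16_t **seg,
				size_t nseg, size_t words_per_seg)
{
	size_t i;

	for (i = 0; i < nseg; i++)
		fake_insw(port, seg[i], words_per_seg);
}

/* Returns 1 if one 8kB read and two 4kB reads deliver identical data
 * and leave the port in the same state. */
static int chunked_matches_single(void)
{
	static uint16_t whole[2 * PAGE_WORDS];
	static uint16_t a[PAGE_WORDS], b[PAGE_WORDS];
	uint16_t *seg[2];
	struct fake_port p1 = { 0 }, p2 = { 0 };

	seg[0] = b;     /* deliberately not in address order */
	seg[1] = a;

	fake_insw(&p1, whole, 2 * PAGE_WORDS);
	fake_input_segments(&p2, seg, 2, PAGE_WORDS);

	return memcmp(whole, b, sizeof(b)) == 0 &&
	       memcmp(whole + PAGE_WORDS, a, sizeof(a)) == 0 &&
	       p1.next == p2.next;
}
```

In this model the port cannot tell the two access patterns apart, since it
only ever sees a sequence of word reads. Real hardware timing is not
modelled here.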

Linus

2002-01-22 18:55:45

by Andre Hedrick

Subject: Re: Linux 2.5.3-pre1-aia1

On Tue, 22 Jan 2002, Linus Torvalds wrote:

>
> On Tue, 22 Jan 2002, Vojtech Pavlik wrote:
> >
> > That's pretty much nonsense, beg my pardon. The real correct way would
> > be:
> >
> > issue read of 255 sectors using READ_MULTI, max_mult = 16
> > receive interrupt
> > inw() first 4k to buffer A
> > inw() second 4k to buffer B
> > don't do anything else until the next interrupt
>
> Definitely.
>
> There is no way the controller can even _know_ the difference between
>
> - one large 8kB "rep insw" instruction
> - two (or more) smaller chunks of "rep insw" adding up to 8kB worth of
> "inw"
>
> as long as there are no other IO instructions to that controller in
> between. The two look _exactly_ the same on the bus - there aren't even
> any bursting issues (you can only burst on MMIO, not PIO accesses).
>
> Sure, there are some timing issues, but (a) data cache misses are much
> bigger things than just a few instructions, and (b) we allow interrupts
> from other devices anyway, so the timing _really_ isn't even an issue.
>
> So just call "ata_input_data()" several times in a loop for discontinuous
> buffers. I told Andre this before.

Linus,

Then do you mind if we add a page alignment (on the page) to the buffer
based on the rq->current_nr_sectors, to ensure a complete page of all
4k of data is present? Only during the transition between these two
states, HPIOO0:HPIOO1, is data permitted. If you touch the data
register to write or read and by chance (almost always in Linux) you do
not have enough data to complete the HPIOO1, the ide_end_request stops
the data process. It is only proper to reject unaligned pages when it
is known to invoke and cause HOST:DEVICE pair problems.

The only way to ensure the correct amount of data is present, and not
risk/create a device driver race, is to reject unaligned pages.

So that I understand you (Linus) clearly and no mistakes are made in
translation: you want, approve, and specify data transfers on unaligned
pages. You require the data-phase at HPIOO1 (transfer data) to leave and
go hunt for more unaligned pages to complete the transaction. There is a
problem, because nowhere can I find a transition point to go find more data.

Somehow you have gotten it in your head that it is legal and correct to
issue partial data blocks, and that has thrown me for a loop.

Respectfully,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-22 23:38:26

by Andre Hedrick

Subject: END GAME (Re: Linux 2.5.3-pre1-aia1)


Linus, Jens,

I need a function that performs the kmapping to return a pointer with all
the data needed for that transaction of the data phase, and will cross
pages correctly, and may cross more than 2 pages at a time in PIO.
I do not care how you do it.

char * majic_voodoo_mapping (
	struct request *rq,
	int nsect,
	unsigned long *flags)
{
	char * buffer_walk = ide_map_rq(rq, flags);
	nsect -= ide_rq_offset(rq);
	do {
		buffer_walk += get_some_more(rq, nsect);
	} while (nsect);
	return buffer_walk;
}

This should solve all the problems in the data-phases and let the driver
run correctly. The result is that each "get_some_more" will allow BLOCK/BIO
to return the partial completions of at least one page.

The function would behave like ide_end_request but only to adjust the
buffer in process, and make block/bio deal with munging it back to the top
layers on the partial completions; it will not stop the data IO process of
the ATOMIC command in process.

There is a possible place, while hanging on the HPIOO1 T-BAR, to safely
leave and collect "ALL" of the buffer to be put down once we start IO'ing.
If it returns with only a partial, YOU WILL HANG THE DRIVE, with one
exception: the REMAINDER of the BLOCK_DATA transaction.

You get your early partial complete, and a correct running driver.
Better yet, I will shut my PIEHOLE!

Cheers,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development

2002-01-23 08:55:47

by Jens Axboe

Subject: Re: END GAME (Re: Linux 2.5.3-pre1-aia1)

On Tue, Jan 22 2002, Andre Hedrick wrote:
>
> Linus, Jens,
>
> I need a function that performs the kmapping to return a pointer with all
> the data needed for that transaction of the data phase, and will cross
> pages correctly, and may cross more than 2 pages at a time in PIO.
> I do not care how you do.
>
> char * majic_voodoo_mapping (
> struct request *rq,
> int nsect,
> unsigned long *flags)
> {
> char * buffer_walk = ide_map_rq(rq, &flags);
> nsect -= ide_rq_offset(rq);
> do {
> buffer_walk += get_some_more(rq, nsect);
> } while (nsect)
> return buffer_walk;
> }
>
> This should solve all the problems in the data-phases and let the driver
> run correctly. The result is on each "get_some_more" will all BLOCK/BIO to
> return the partial competions of at least one page
>
> The function would behave like ide_end_request but only to adjust the
> buffer in process, and make block/bio deal with munging it back to the top
> layers on the partial completions, it will not stop the data IO process of
> the ATOMIC command in process.

That _is_ what ide_end_request is doing, it is not stopping any data I/O
in progress. It is also atomic. What exactly do you think is missing?

> Better yet, I will shut my PIEHOLE!

Oh, that'll be the day. Let's just say I'm not running to get my
calendar and the big fat marker. :-)

--
Jens Axboe

2002-01-23 21:05:38

by Andre Hedrick

Subject: Re: END GAME (Re: Linux 2.5.3-pre1-aia1)

On Wed, 23 Jan 2002, Jens Axboe wrote:

> On Tue, Jan 22 2002, Andre Hedrick wrote:
> >
> > Linus, Jens,
> >
> > I need a function that performs the kmapping to return a pointer with all
> > the data needed for that transaction of the data phase, and will cross
> > pages correctly, and may cross more than 2 pages at a time in PIO.
> > I do not care how you do.
> >
> > char * majic_voodoo_mapping (
> > struct request *rq,
> > int nsect,
> > unsigned long *flags)
> > {
> > char * buffer_walk = ide_map_rq(rq, &flags);
> > nsect -= ide_rq_offset(rq);
> > do {
> > buffer_walk += get_some_more(rq, nsect);
> > } while (nsect)
> > return buffer_walk;
> > }

When I chatted w/ Anton, he suggested a char ** to allow page walking.
This is more in line for allowing a rq->bio_list walking.

Obviously I may not have the correct item but the idea should be clean.
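
To make the "char **" idea concrete, here is one possible shape for it,
written as a user-space sketch. Everything in it is an assumption for
illustration: struct fake_seg models one entry of the request's buffer list,
struct fake_xfer is one destination for the data phase, and fake_map_xfer()
is a hypothetical name, not an existing block-layer or IDE function.

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SIZE 512

/* Hypothetical model of one entry in the request's buffer list. */
struct fake_seg {
	char *data;            /* start of this segment's buffer */
	unsigned int sectors;  /* sectors it holds */
};

/* One piece of a data-phase transfer: where it goes and how much. */
struct fake_xfer {
	char *dst;
	unsigned int sectors;
};

/* Fill `out` with the destinations needed to move nsect sectors, starting
 * `skip` sectors into the segment list. Returns the number of entries
 * written, or -1 if the list is too short or `out` is too small. */
static int fake_map_xfer(const struct fake_seg *seg, size_t nseg,
			 unsigned int skip, unsigned int nsect,
			 struct fake_xfer *out, size_t max_out)
{
	size_t i, n = 0;

	assert(seg != NULL && out != NULL);
	for (i = 0; i < nseg && nsect; i++) {
		unsigned int avail = seg[i].sectors;
		unsigned int take;

		if (skip >= avail) {    /* segment already completed */
			skip -= avail;
			continue;
		}
		avail -= skip;
		take = avail < nsect ? avail : nsect;
		if (n == max_out)
			return -1;
		out[n].dst = seg[i].data + (size_t)skip * SECTOR_SIZE;
		out[n].sectors = take;
		n++;
		nsect -= take;
		skip = 0;
	}
	return nsect ? -1 : (int)n;
}
```

The caller would then loop over the returned entries and issue one PIO
transfer per entry, so the whole data block is accounted for before the
first word is read, without copying into a contiguous scratch buffer.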

> > This should solve all the problems in the data-phases and let the driver
> > run correctly. The result is on each "get_some_more" will all BLOCK/BIO to
> > return the partial competions of at least one page
> >
> > The function would behave like ide_end_request but only to adjust the
> > buffer in process, and make block/bio deal with munging it back to the top
> > layers on the partial completions, it will not stop the data IO process of
> > the ATOMIC command in process.
>
> That _is_ what ide_end_request is doing, it is not stopping any data I/O
> in progress. It is also atomic. What exactly do you think is missing?

If you are provided with an acceptable solution to make BLOCK/BIO more
flexible to the needs of the hardware, that is a possible answer and
should be acceptable, provided it does not violate other layers?

> > Better yet, I will shut my PIEHOLE!
>
> Oh, that'll be the day. Lets just say I'm not running to get my
> calendar and the big fat marker. :-)

Don't worry, I will FedEX one to you so you will not have to search.

Cheers,

Andre Hedrick
Linux Disk Certification Project Linux ATA Development