2007-05-30 22:10:40

by Daniel J Blueman

Subject: Compact Flash performance...

I have a SanDisk Extreme IV 4GB CF card, capable of 40MB/s read, but
am seeing 30MB/s read [1], connected directly to the IDE bus on my
ICH8 controller.

How can I find out whether this is a timing or configuration issue?
On 2.6.20.5 [2], the 120ns timing looks right [3], but perhaps the
lack of multi-word transfers is hurting here. Alas, that can't be
enabled with the libata subsystem and 'hdparm -m', so what else?

Daniel

--- [1]

# hdparm -t /dev/sdb
/dev/sdb:
Timing buffered disk reads: 94 MB in 3.05 seconds = 30.79 MB/sec

--- [2]

ata7: PATA max UDMA/100 cmd 0x000000000001bc00 ctl 0x000000000001b882 bmdma 0x000000000001b400 irq 17
ata7.00: CFA: SanDisk SDCFX-4096, HDX 4.04, max UDMA/66
ata7.00: 8027712 sectors, multi 0: LBA
ata7.00: configured for UDMA/66
ATA: abnormal status 0x7F on port 0x000000000001b807
scsi 6:0:0:0: Direct-Access ATA SanDisk SDCFX-40 HDX PQ: 0 ANSI: 5
SCSI device sdb: 8027712 512-byte hdwr sectors (4110 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: write cache: disabled, read cache: enabled, doesn't support DPO or FUA
SCSI device sdb: 8027712 512-byte hdwr sectors (4110 MB)
sdb: sdb1
sd 6:0:0:0: Attached scsi removable disk sdb

--- [3]

# hdparm -I /dev/sdb

/dev/sdb:

CompactFlash ATA device, with removable media
        Model Number:       SanDisk SDCFX-4096
        Serial Number:      116802D2807J3335
        Firmware Revision:  HDX 4.04
Standards:
        Supported: 4
        Likely used: 4
Configuration:
        Logical         max     current
        cylinders       7964    7964
        heads           16      16
        sectors/track   63      63
        --
        CHS current addressable sectors:    8027712
        LBA user addressable sectors:       8027712
        device size with M = 1024*1024:        3919 MBytes
        device size with M = 1000*1000:        4110 MBytes (4 GB)
Capabilities:
        LBA, IORDY(may be)(cannot be disabled)
        Standby timer values: spec'd by Vendor
        R/W multiple sector transfer: Max = 4   Current = 0
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 *udma4
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4
             Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
                Write cache
           *    CFA feature set
--
Daniel J Blueman


2007-05-30 22:31:21

by Lee Revell

Subject: Re: Compact Flash performance...

On 5/30/07, Daniel J Blueman <[email protected]> wrote:
> I have a SanDisk Extreme IV 4GB CF card, capable of 40MB/s read, but
> am seeing 30MB/s read [1], connected directly to the IDE bus on my
> ICH8 controller.

How do you know it's capable of 40MB/s read?

Lee

Subject: Re: Compact Flash performance...


Hi,

Since you are using the libata ata_piix driver and not the IDE piix one,
Jeff and/or Alan are the right people to ask this question...

Anyway...

On Thursday 31 May 2007, Daniel J Blueman wrote:
> I have a SanDisk Extreme IV 4GB CF card, capable of 40MB/s read, but
> am seeing 30MB/s read [1], connected directly to the IDE bus on my
> ICH8 controller.
>
> How can I find out if this would be a timing or configuration issue?
> On 2.6.20.5 [2], the 120nS timing looks to be right [3], but perhaps

The multi-word DMA cycle timing seems to be configured OK.

It shouldn't really matter, though, since that timing applies to multi-word
DMA transfers and this device is using UDMA transfers.

> no multi-word transfer is hurting here...alas, it can't be enabled
> with the libata subsystem and 'hdparm -m', so what else?

-m is for multi-sector PIO transfers and probably won't help here.

Everything (except harmless "abnormal status" garbage) seems fine.

Where does the 40MB/s maximum come from? Were you able to get this device to
work at this speed using some other controller and/or another OS?

Thanks,
Bart


2007-05-31 03:24:42

by Mark Lord

Subject: Re: Compact Flash performance...

Daniel J Blueman wrote:
> I have a SanDisk Extreme IV 4GB CF card, capable of 40MB/s read, but
> am seeing 30MB/s read [1], connected directly to the IDE bus on my
> ICH8 controller.
>
> How can I find out if this would be a timing or configuration issue?
> On 2.6.20.5 [2], the 120nS timing looks to be right [3], but perhaps
> no multi-word transfer is hurting here...alas, it can't be enabled
> with the libata subsystem and 'hdparm -m', so what else?

Please post the output from "hdparm --Istdout /dev/sdb" for this card.

Thanks

2007-05-31 09:18:59

by Daniel J Blueman

Subject: Re: Compact Flash performance...

On 30/05/07, Lee Revell <[email protected]> wrote:
> On 5/30/07, Daniel J Blueman <[email protected]> wrote:
> > I have a SanDisk Extreme IV 4GB CF card, capable of 40MB/s read, but
> > am seeing 30MB/s read [1], connected directly to the IDE bus on my
> > ICH8 controller.
>
> How do you know it's capable of 40MB/s read?

Hi Lee,

There are various reports of users getting 40MB/s == 38.1MiB/s with
UDMA mode 6 media readers (which are few and far between):

http://www.robgalbraith.com/bins/content_page.asp?cid=7-7896-8475
http://www.it-enquirer.com/media/screenshots/sandisk-extreme-bench.png

If a particular reader is able to get 40MB/s through the CF interface,
then this most likely will be achievable with ICH8+Linux also, but
perhaps parameters need to be tuned or the right IDE configuration
needs to be reached etc.

> Lee
--
Daniel J Blueman

2007-05-31 09:22:21

by Daniel J Blueman

Subject: Re: Compact Flash performance...

Hi Mark,

Thanks for the reply; here is the raw identification data:

# hdparm --Istdout /dev/sdb

/dev/sdb:
045a 3fff c837 0010 0000 0000 003f 0000
0000 0000 2020 2020 2020 5644 5334 3142
5434 4456 3038 474a 0003 3bf5 0034 5634
344f 4139 3641 4844 5437 3232 3532 3544
4c41 3338 3020 2020 2020 2020 2020 2020
2020 2020 2020 2020 2020 2020 2020 8010
0000 2f00 4000 0200 0200 0007 3fff 0010
003f fc10 00fb 0110 ffff 0fff 0000 0007
0003 0078 0078 00f0 0078 0000 0000 0000
0000 0000 0000 001f 0306 0000 005e 0040
00fc 001a 346b 7fe9 4773 3469 3e01 4763
407f 0038 0000 00fe fffe 0000 80fe 0008
00ca 00f9 2710 0000 5970 1d1c 0000 0000
00ca 0000 0000 5a87 5000 cca2 0bd9 ea2d
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0009 000b 0000 0000 3982 0db1 fe20 0001
4000 0004 0000 0000 0000 1df7 28db 131a
0300 0280 3f7f 00c0 0040 2b00 8000 0000
344f 4339 0000 4004 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 7da5

Daniel

On 31/05/07, Mark Lord <[email protected]> wrote:
> Daniel J Blueman wrote:
> > I have a SanDisk Extreme IV 4GB CF card, capable of 40MB/s read, but
> > am seeing 30MB/s read [1], connected directly to the IDE bus on my
> > ICH8 controller.
> >
> > How can I find out if this would be a timing or configuration issue?
> > On 2.6.20.5 [2], the 120nS timing looks to be right [3], but perhaps
> > no multi-word transfer is hurting here...alas, it can't be enabled
> > with the libata subsystem and 'hdparm -m', so what else?
>
> Please post the output from "hdparm --Istdout /dev/sdb" for this card.
>
> Thanks
--
Daniel J Blueman

2007-05-31 12:23:15

by Mark Lord

Subject: Re: Compact Flash performance...

Daniel J Blueman wrote:
> Hi Mark,
>
> Thanks for the reply; here is the raw identification data:
>
> # hdparm --Istdout /dev/sdb
>
> /dev/sdb:
> 045a 3fff c837 0010 0000 0000 003f 0000
> 0000 0000 2020 2020 2020 5644 5334 3142
> 5434 4456 3038 474a 0003 3bf5 0034 5634
> 344f 4139 3641 4844 5437 3232 3532 3544
> 4c41 3338 3020 2020 2020 2020 2020 2020

ATA device, with non-removable media
Model Number: HDT722525DLA380
Serial Number: VDS41BT4DV08GJ
Firmware Revision: V44OA96A
...
device size with M = 1000*1000: 250059 MBytes (250 GB)
...
* SATA-I signaling speed (1.5Gb/s)
* SATA-II signaling speed (3.0Gb/s)


Ooops.. wrong drive. That's NOT a CF card. Try again?
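
The strings above can be recovered by hand from the raw words. A minimal sketch, assuming the standard IDENTIFY DEVICE layout (serial number in words 10-19, firmware revision in words 23-26, model number starting at word 27) and that each 16-bit word holds two ASCII characters with the high byte first; the helper name ata_string is made up for illustration:

```python
# First 40 words of the dump quoted above (hdparm --Istdout prints 8 words per row).
ROWS = """
045a 3fff c837 0010 0000 0000 003f 0000
0000 0000 2020 2020 2020 5644 5334 3142
5434 4456 3038 474a 0003 3bf5 0034 5634
344f 4139 3641 4844 5437 3232 3532 3544
4c41 3338 3020 2020 2020 2020 2020 2020
"""

words = [int(w, 16) for w in ROWS.split()]


def ata_string(words, first, last):
    """Decode IDENTIFY words first..last (inclusive) as an ATA string.

    Each 16-bit word carries two ASCII characters, high byte first.
    """
    raw = bytearray()
    for w in words[first:last + 1]:
        raw.append(w >> 8)
        raw.append(w & 0xFF)
    return raw.decode("ascii").strip()


serial = ata_string(words, 10, 19)
firmware = ata_string(words, 23, 26)
# The model field runs to word 46, but only words 0-39 are quoted here.
model = ata_string(words, 27, 39)

print("Model Number:     ", model)
print("Serial Number:    ", serial)
print("Firmware Revision:", firmware)
```

The result is a Hitachi-style model string rather than "SanDisk SDCFX-4096", which is how the mix-up shows.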

Thanks

2007-05-31 17:26:04

by Daniel J Blueman

Subject: Re: Compact Flash performance...

On 31/05/07, Mark Lord <[email protected]> wrote:
> Daniel J Blueman wrote:
> > Hi Mark,
> >
> > Thanks for the reply; here is the raw identification data:
> >
> > # hdparm --Istdout /dev/sdb
> >
> > /dev/sdb:
[snip]
> Ooops.. wrong drive. That's NOT a CF card. Try again?

Whoops, yes. Here is the expected data:

# hdparm --Istdout /dev/sdb

/dev/sdb:
848a 1f1c 0000 0010 0000 0240 003f 007a
7e40 0000 2020 2020 3131 3638 3032 4432
3830 374a 3333 3335 0002 0002 0004 4844
5820 342e 3034 5361 6e44 6973 6b20 5344
4346 582d 3430 3936 2020 2020 2020 2020
2020 2020 2020 2020 2020 2020 2020 0004
0000 0300 0000 0200 0000 0007 1f1c 0010
003f 7e40 007a 0100 7e40 007a 0000 0007
0003 0078 0078 0078 0078 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0010 0000 0020 4004 4000 0000 0004 4000
101f 0000 0000 0000 0000 2000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0082 001b 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
--
Daniel J Blueman

2007-05-31 20:54:49

by Mark Lord

Subject: Re: Compact Flash performance...

Daniel J Blueman wrote:
>
> Whoops, yes. Here is the expected data:
>
> # hdparm --Istdout /dev/sdb
>
> /dev/sdb:
> 848a 1f1c 0000 0010 0000 0240 003f 007a
> 7e40 0000 2020 2020 3131 3638 3032 4432
> 3830 374a 3333 3335 0002 0002 0004 4844
> 5820 342e 3034 5361 6e44 6973 6b20 5344
> 4346 582d 3430 3936 2020 2020 2020 2020
> 2020 2020 2020 2020 2020 2020 2020 0004
> 0000 0300 0000 0200 0000 0007 1f1c 0010
> 003f 7e40 007a 0100 7e40 007a 0000 0007
> 0003 0078 0078 0078 0078 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0010 0000 0020 4004 4000 0000 0004 4000
> 101f 0000 0000 0000 0000 2000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0082 001b 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000

Thanks. I'll use that data to update/validate future versions of hdparm.
At UDMA66, it *should* be capable of the 40MByte/sec realm of readback perf,
assuming the card itself is really that fast.

I don't know much about the specifics, but perhaps the
card is only capable of full speed in PIO6, which requires special cabling
and is currently unsupported in libata (?).

Another factor is that hdparm performs discrete, non-overlapping
reads of 1MByte chunks for its timing test. Some drives cannot achieve
full performance with such (relatively) large gaps between I/Os.
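
The access pattern described here can be approximated in a few lines. This is only a sketch of "one 1MByte read at a time, each issued after the previous one returns", not hdparm's actual implementation; the function name timed_sequential_read and the scratch-file demo are assumptions made for illustration:

```python
import os
import tempfile
import time

CHUNK = 1024 * 1024  # 1 MiB per read, as described above


def timed_sequential_read(path, chunk=CHUNK, limit=None):
    """Read `path` front to back in fixed-size chunks.

    Returns (bytes_read, seconds). Each read is issued only after the
    previous one has returned, so this process never has more than one
    request outstanding.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while limit is None or total < limit:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    return total, time.perf_counter() - start


if __name__ == "__main__":
    # Demo on a scratch file. Pointing this at /dev/sdb needs root, and
    # the page cache will inflate the figures on repeat runs.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(4 * CHUNK))
    try:
        nbytes, secs = timed_sequential_read(tmp.name)
        print(f"{nbytes / 1e6:.1f} MB in {secs:.4f} s")
    finally:
        os.unlink(tmp.name)
```

Dividing the byte count by the elapsed time gives a figure comparable in spirit to the "Timing buffered disk reads" line, though caching makes the two not directly equivalent.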

Also, just for fun, you could try "hdparm --direct -t /dev/sdb"

Cheers

2007-05-31 21:39:42

by Daniel J Blueman

Subject: Re: Compact Flash performance...

On 31/05/07, Mark Lord <[email protected]> wrote:
> Daniel J Blueman wrote:
> > Whoops, yes. Here is the expected data:
[snip]
>
> Thanks. I'll use that data to update/validate future versions of hdparm.
> At UDMA66, it *should* be capable of the 40MByte/sec realm of readback perf,
> assuming the card itself is really that fast.

hdparm in the other identify mode does list the UDMA3/4 modes twice
[1], which looks odd.

> I don't know too much about the specifics, though, but perhaps the
> card is only capable of full speed in PIO6, which requires special cabling
> and is currently unsupported in libata (?).

Seems less likely, as the Extreme IV reader (and another) supports
UDMA mode 4; in PIO mode 6, they apparently top out at 17MB/s [2],
which seems reasonable.

> Another factor, is that hdparm performs discrete, non-overlapping,
> reads of 1MByte chunks for its timing test. Some drives cannot achieve
> full performance with such (relatively) large gaps between IO's.

100MB transfers still achieve 32MB/s:

# dd if=/dev/sdb of=/dev/null bs=100000k count=10
10+0 records in
10+0 records out
1024000000 bytes (1.0 GB) copied, 31.9328 seconds, 32.1 MB/s

> Also, just for fun, you could try "hdparm --direct -t /dev/sdb"

# hdparm --direct -t /dev/sdb

/dev/sdb:
Timing O_DIRECT disk reads: 96 MB in 3.05 seconds = 31.47 MB/sec

It is conceivable that the controllers in the two particular readers
which get 40MB/s are doing some kind of prefetching, but that seems
like an extreme gain.

I'll check things out with the IDE PIIX code also. Thanks for your help!
Daniel

> Cheers

--- [1]

# hdparm -i /dev/sdb

/dev/sdb:

Model=SanDisk SDCFX-4096 , FwRev=HDX 4.04,
SerialNo= 116802D2807J3335
Config={ HardSect NotMFM Removeable DTR>10Mbs nonMagnetic }
RawCHS=7964/16/63, TrkSize=0, SectSize=576, ECCbytes=4
BuffType=DualPort, BuffSize=1kB, MaxMultSect=4, MultSect=?0?
CurCHS=7964/16/63, CurSects=8027712, LBA=yes, LBAsects=8027712
IORDY=no, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 *udma4 udma3 *udma4
AdvancedPM=no WriteCache=disabled
Drive conforms to: Unspecified: ATA/ATAPI-4

* signifies the current active mode

--- [2]

http://www.robgalbraith.com/bins/content_page.asp?cid=7-7896-8475
--
Daniel J Blueman

2007-05-31 22:34:11

by Mark Lord

Subject: Re: Compact Flash performance...

Daniel J Blueman wrote:
> On 31/05/07, Mark Lord <[email protected]> wrote:
>> Daniel J Blueman wrote:
>> > Whoops, yes. Here is the expected data:
> [snip]
>>
>> Thanks. I'll use that data to update/validate future versions of hdparm.
>> At UDMA66, it *should* be capable of the 40MByte/sec realm of readback
>> perf,
>> assuming the card itself is really that fast.
>
> hdparm in the other identify mode does list the UDMA3/4 modes twice
> [1], which looks odd.
>
>> I don't know too much about the specifics, though, but perhaps the
>> card is only capable of full speed in PIO6, which requires special
>> cabling
>> and is currently unsupported in libata (?).
>
> Seems less likely, as the Extreme IV reader (and another) supports
> UDMA mode 4; in PIO mode 6, they apparently top out at 17MB/s [2],
> which seems reasonable.
>
>> Another factor, is that hdparm performs discrete, non-overlapping,
>> reads of 1MByte chunks for its timing test. Some drives cannot achieve
>> full performance with such (relatively) large gaps between IO's.
>
> 100MB transfers still achieve 32MB/s:
>
> # dd if=/dev/sdb of=/dev/null bs=100000k count=10
> 10+0 records in
> 10+0 records out
> 1024000000 bytes (1.0 GB) copied, 31.9328 seconds, 32.1 MB/s
>
>> Also, just for fun, you could try "hdparm --direct -t /dev/sdb"
>
> # hdparm --direct -t /dev/sdb
>
> /dev/sdb:
> Timing O_DIRECT disk reads: 96 MB in 3.05 seconds = 31.47 MB/sec
>
> It is conceivable that the controllers in the two particular readers
> which get 40MB/s are doing some kind of prefetching, but that seems
> like an extreme gain.

Okay, here's the new hdparm information for this:

Capabilities:
        ...
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 *udma4
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4
             Cycle time: no flow control=120ns  IORDY flow control=120ns

Commands/features:
        Enabled Supported:
                Write cache
           *    CFA feature set
           *    CFA advanced modes: pio5 *pio6

So, udma4 and pio6 are the fastest supported speeds.
According to the CFA specifications (v4.1), either of those modes
requires SHORT cables and special handling. You probably have
regular (16-18") cables, libata doesn't support PIO6,
and the motherboard chipset may not support the "special handling"
requirements in other ways. Also, there must be only one device on the cable.
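
For reference, the mode lists above can be cross-checked against the card's raw IDENTIFY words posted earlier in the thread. A rough sketch: the word 60-61 and word 88 layouts are the standard ATA ones, while the word 163 field layout is an assumption based on a reading of the CF specification and should be verified against it:

```python
# Values copied from the card's "hdparm --Istdout" dump earlier in the thread
# (8 words per row, so word N sits in row N // 8, column N % 8).
WORD_60, WORD_61 = 0x7E40, 0x007A   # LBA sector count, low word first
WORD_88 = 0x101F                    # Ultra DMA modes
WORD_163 = 0x0082                   # CFA advanced timing modes (assumed layout)

sectors = (WORD_61 << 16) | WORD_60

# Word 88: bits 0-6 flag supported UDMA modes, bits 8-14 the selected mode.
udma_supported = [m for m in range(7) if WORD_88 & (1 << m)]
udma_selected = [m for m in range(7) if WORD_88 & (1 << (m + 8))]

# Assumed from the CF specification: bits 2:0 = highest advanced PIO mode
# supported, bits 8:6 = advanced PIO mode currently selected, where a field
# value of 1 means pio5 and 2 means pio6.
adv_pio_max = WORD_163 & 0x7
adv_pio_cur = (WORD_163 >> 6) & 0x7

print("sectors:", sectors)
print("udma supported:", udma_supported, "selected:", udma_selected)
print("advanced PIO: max pio%d, current pio%d" % (4 + adv_pio_max, 4 + adv_pio_cur))
```

Under those assumptions the words agree with the listing: 8027712 sectors, udma0 through udma4 with udma4 selected, and pio6 as the highest advanced PIO mode.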

I see from your earlier posting that libata selected UDMA/66 (udma4)
for the device though, since libata doesn't know that your cable
is too long. And that mode is working, so that's probably as good
as it gets on that particular motherboard chipset.

Some cards may perform better when their "memory" interface is used
instead of the "I/O" interface, or vice-versa. I'm not sure which
of the two methods was selected by libata (probably the "memory" interface).

There is also a "PC-Card" style interface with shared-memory,
which some USB readers *may* use as an alternative to the standard
IDE/ATA style interface.

Cheers

2007-05-31 22:35:37

by Mark Lord

Subject: Re: Compact Flash performance...

> Daniel J Blueman wrote:
>>
>> hdparm in the other identify mode does list the UDMA3/4 modes twice
>> [1], which looks odd.

That got fixed a few revisions ago. Update your copy of hdparm
from the masters on sourceforge.

Cheers

2007-05-31 22:37:34

by Jeff Garzik

Subject: Re: Compact Flash performance...

Mark Lord wrote:
> Some cards may perform better when their "memory" interface is used
> instead of the "I/O" interface, or vice-versa. I'm not sure which
> of the two methods was selected by libata (probably the "memory"
> interface).

I am very CF-ignorant. How does libata select a memory or I/O interface
on a CF device?

Jeff


2007-05-31 22:40:29

by Mark Lord

Subject: Re: Compact Flash performance...

Daniel J Blueman wrote:
> On 31/05/07, Mark Lord <[email protected]> wrote:
>> Daniel J Blueman wrote:
...
>> I don't know too much about the specifics, though, but perhaps the
>> card is only capable of full speed in PIO6, which requires special
>> cabling
>> and is currently unsupported in libata (?).
>
> Seems less likely, as the Extreme IV reader (and another) supports
> UDMA mode 4; in PIO mode 6, they apparently top out at 17MB/s [2],
> which seems reasonable.

That's pio4 (16.6666MBytes/sec).
pio6 should have the same cycle time as udma4.
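
The pio4 figure follows directly from the cycle time. A small worked check, assuming a 16-bit data bus that moves 2 bytes per cycle and using the 120ns cycle time and the hdparm -t result reported earlier in the thread (the function name burst_rate_mb_s is made up for illustration):

```python
def burst_rate_mb_s(cycle_ns, bytes_per_cycle=2):
    """Peak transfer rate in MB/s (10**6 bytes) for a bus that moves
    `bytes_per_cycle` bytes every `cycle_ns` nanoseconds."""
    return bytes_per_cycle / (cycle_ns * 1e-9) / 1e6


pio4 = burst_rate_mb_s(120)   # 120 ns cycle, from the hdparm -I output
measured = 94 / 3.05          # MB and seconds from the hdparm -t run

print(f"pio4 ceiling: {pio4:.2f} MB/s")
print(f"hdparm -t:    {measured:.2f} MB/s")
```

Since the measured rate is well above the pio4 ceiling, the reads on this system cannot be going through pio4, which is consistent with the "configured for UDMA/66" line in the kernel log.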

>> Another factor, is that hdparm performs discrete, non-overlapping,
>> reads of 1MByte chunks for its timing test. Some drives cannot achieve
>> full performance with such (relatively) large gaps between IO's.
>
> 100MB transfers still achieve 32MB/s:

But internally libata is probably breaking those up into 64KB transfers,
with gaps between requests. The best it could do would be 128KB transfers.
To maximize throughput, some kind of host-queuing would be needed,
or just have the driver sit in a tight loop, starting the next I/O
immediately when the previous one finishes. Linux isn't that quick (yet).
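
To get a feel for how much idle time per request this would take, here is a deliberately simple model: each request transfers at a fixed burst rate and is followed by a fixed gap. The 40 MB/s burst rate is an assumption (the card's rated speed, not a measurement), and the model ignores everything else in the I/O path:

```python
def effective_rate(request_bytes, burst_bytes_per_s, gap_s):
    """Throughput when every request is followed by an idle gap."""
    transfer_s = request_bytes / burst_bytes_per_s
    return request_bytes / (transfer_s + gap_s)


def gap_for_rate(request_bytes, burst_bytes_per_s, observed_bytes_per_s):
    """Idle time per request that would pull the burst rate down to the
    observed rate."""
    return request_bytes / observed_bytes_per_s - request_bytes / burst_bytes_per_s


BURST = 40e6       # assumption: the card streams at its rated 40 MB/s
OBSERVED = 32.1e6  # dd result reported earlier in the thread

for size in (64 * 1024, 128 * 1024):
    gap = gap_for_rate(size, BURST, OBSERVED)
    print(f"{size // 1024:4d} KiB requests: gap of {gap * 1e6:.0f} us per request")
```

In this model a gap of a few hundred microseconds per request is enough to account for the difference, so per-request latency is at least a plausible contributor.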

Cheers

2007-05-31 22:43:58

by Mark Lord

Subject: Re: Compact Flash performance...

Jeff Garzik wrote:
> Mark Lord wrote:
>> Some cards may perform better when their "memory" interface is used
>> instead of the "I/O" interface, or vice-versa. I'm not sure which
>> of the two methods was selected by libata (probably the "memory"
>> interface).
>
> I am very CF-ignorant. How does libata select a memory or I/O interface
> on a CF device?

Right. Usually we cannot select them, as it's the wires between
the ATA chipset (motherboard) and the CFCARD that determine this.

So I suppose this means that most implementations are using the I/O access method,
except for some embedded systems where the CFCARD is wired to the host bus
without a separate "controller" chip in between.

Cheers

2007-05-31 23:27:08

by Jeff Garzik

Subject: Re: Compact Flash performance...

Mark Lord wrote:
> To maximize throughput, some kind of host-queuing would be needed,
> or just have the driver sit in a tight loop, starting the next I/O
> immediately when the previous one finishes. Linux isn't that quick (yet).


I was talking on IRC with Tejun just recently. There are several
controllers (and/or "situations") like this, where some amount of host
queueing would permit greater throughput, even when NCQ is not
supported. sata_sx4 is the most dramatic example, where host queueing
could potentially increase speed by a factor of 10 or more, since it is
penalized by an awful two-irq-per-command (w/ a per-host bottleneck to
boot) setup. Silicon Image has a "command buffer". And overall, I
designed the ->qc_prep() hook separate from ->qc_issue() to enable the
preparation of multiple commands such that it only takes a simple "go"
I/O to start a transaction, immediately after the previous one ends.

Jeff


2007-05-31 23:47:26

by Daniel J Blueman

Subject: Re: Compact Flash performance...

Hi Mark,

On 31/05/07, Mark Lord <[email protected]> wrote:
> Daniel J Blueman wrote:
> > On 31/05/07, Mark Lord <[email protected]> wrote:
> >> Daniel J Blueman wrote:
> >> > Whoops, yes. Here is the expected data:
> > [snip]
> >>
> >> Thanks. I'll use that data to update/validate future versions of hdparm.
> >> At UDMA66, it *should* be capable of the 40MByte/sec realm of readback
> >> perf,
> >> assuming the card itself is really that fast.
> >
> > hdparm in the other identify mode does list the UDMA3/4 modes twice
> > [1], which looks odd.
> >
> >> I don't know too much about the specifics, though, but perhaps the
> >> card is only capable of full speed in PIO6, which requires special
> >> cabling
> >> and is currently unsupported in libata (?).
> >
> > Seems less likely, as the Extreme IV reader (and another) supports
> > UDMA mode 4; in PIO mode 6, they apparently top out at 17MB/s [2],
> > which seems reasonable.
> >
> >> Another factor, is that hdparm performs discrete, non-overlapping,
> >> reads of 1MByte chunks for its timing test. Some drives cannot achieve
> >> full performance with such (relatively) large gaps between IO's.
> >
> > 100MB transfers still achieve 32MB/s:
> >
> > # dd if=/dev/sdb of=/dev/null bs=100000k count=10
> > 10+0 records in
> > 10+0 records out
> > 1024000000 bytes (1.0 GB) copied, 31.9328 seconds, 32.1 MB/s
> >
> >> Also, just for fun, you could try "hdparm --direct -t /dev/sdb"
> >
> > # hdparm --direct -t /dev/sdb
> >
> > /dev/sdb:
> > Timing O_DIRECT disk reads: 96 MB in 3.05 seconds = 31.47 MB/sec
> >
> > It is conceivable that the controllers in the two particular readers
> > which get 40MB/s are doing some kind of prefetching, but that seems
> > like an extreme gain.
>
> Okay, here's the new hdparm information for this:
>
[snip]
> * CFA advanced modes: pio5 *pio6
>
> So, udma4 and pio6 are the fastest supported speeds.
> According to the CFA specifications (v4.1), either of those modes
> requires SHORT cables and special handling. You probably have
> regular (16-18") cables, and libata doesn't support PIO6,
> and the motherboard chipset may not support the "special handling"
> requirements in other ways. Also, only one device on the cable.

It makes sense for Linux to default to normal cable lengths in the absence
of some mechanism to detect or specify this; in this case my CF card
is plugged directly into the motherboard with a CF adapter [1], so there is
one device and short traces. Anyway, I'd imagine the "special handling" and
other requirements have been introduced or influenced by vendors, so they
are possibly more special-cased than can be handled here.

> I see from your earlier posting that libata selected UDMA/66 (udma4)
> for the device though, since libata doesn't know that your cable
> is too long. And that mode is working, so that's probably as good
> as it gets on that particular motherboard chipset.

I couldn't find a kernel parameter to specify if I have a long/short
cable. Is there a way?

> Some cards may perform better when their "memory" interface is used
> instead of the "I/O" interface, or vice-versa. I'm not sure which
> of the two methods was selected by libata (probably the "memory" interface).
>
> There is also a "PC-Card" style interface with shared-memory,
> which some USB readers *may* use as an alternative to the standard
> IDE/ATA style interface.
>
> Cheers

Thanks for the detail in the other mails too; it's useful,
Daniel

--- [1]

http://img.inkfrog.com/pix/kitty.wun/Female_CF_type3.jpg
--
Daniel J Blueman

2007-06-01 00:00:27

by Robert Hancock

Subject: Re: Compact Flash performance...

Jeff Garzik wrote:
> Mark Lord wrote:
>> To maximize throughput, some kind of host-queuing would be needed,
>> or just have the driver sit in a tight loop, starting the next I/O
>> immediately when the previous one finishes. Linux isn't that quick
>> (yet).
>
>
> I was talking on IRC with Tejun just recently. There are several
> controllers (and/or "situations") like this, where some amount of host
> queueing would permit greater throughput, even when NCQ is not
> supported. sata_sx4 is the most dramatic example, where host queueing
> could potentially increase speed by a factor of 10 or more, since it is
> penalized by an awful two-irq-per-command (w/ a per-host bottleneck to
> boot) setup. Silicon Image has a "command buffer". And overall, I
> designed ->qc_prep() hook separate from ->qc_issue() to enable the
> preparation of multiple commands such that it only takes a simple "go"
> I/O to start a transaction, immediately after the previous one ends.
>
> Jeff

Theoretically, NVIDIA nForce4 ADMA could likely do this as well, as it
seems to allow chaining up multiple commands to execute in succession
(assuming they're not NCQ).

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/

2007-06-02 05:10:33

by Willy Tarreau

Subject: Re: Compact Flash performance...

On Thu, May 31, 2007 at 06:43:46PM -0400, Mark Lord wrote:
> Jeff Garzik wrote:
> >Mark Lord wrote:
> >>Some cards may perform better when their "memory" interface is used
> >>instead of the "I/O" interface, or vice-versa. I'm not sure which
> >>of the two methods was selected by libata (probably the "memory"
> >>interface).
> >
> >I am very CF-ignorant. How does libata select a memory or I/O interface
> >on a CF device?
>
> Right. Usually we cannot select them, as it's the wires between
> the ATA chipset (motherboard) and the CFCARD that determine this.

CF cards support 3 modes (MEM, I/O and True IDE), and neither MEM nor I/O
mode can talk IDE. Most often, pin 9 is simply shorted to ground
at the connector to put the card in True IDE mode, which makes it emulate
a standard IDE disk.

Cheers,
Willy