The first open hardware docs for the Promise SX4 (sata_sx4) series are
now available:
http://gkernel.sourceforge.net/specs/promise/pdc20621-pguide-dimm-1.6.pdf.bz2
http://gkernel.sourceforge.net/specs/promise/pdc20621-pguide-pll-ata-timing-1.2.pdf.bz2
These are only small, ancillary guides; the main hardware doc should be
opened soon.
However, I would like to take this opportunity to point hackers looking
for a project at this hardware. The Promise SX4 is pretty neat, and it
needs more attention than I can give, to reach its full potential.
Here's a braindump:
* It's an older chipset/board, probably not actively sold anymore
* ATA programming interface very close to sata_promise (PDC2037x)
* Contains on-board DIMM, to be used for any purpose the driver desires
* Contains on-board RAID5 XOR, also fully programmable
A key problem is that, under Linux, sata_sx4 cannot fully exploit the
RAID-centric power of this hardware, because it drives the hardware in
"dumb ATA mode".
A better driver would notice when a RAID1 or RAID5 array contains
multiple components attached to the SX4, and send only a single copy of
the data to the card (saving PCI bus bandwidth tremendously).
Similarly, a better driver would take advantage of the RAID5 XOR offload
capabilities, to offload the entire RAID5 read or write transaction to
the card.
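For anyone new to RAID5, the XOR step being offloaded is simple to state
in software. The following is only a minimal sketch in plain C, not SX4
or MD code: the function name xor_blocks_sketch and its signature are
made up for illustration. On the card, the same operation would be done
by the XOR engine on blocks held in the on-board DIMM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel or SX4 API): XOR `nsrc` equally
 * sized source blocks into `dst`.
 *
 * - Sources = all data blocks of a stripe  -> dst is the parity block.
 * - Sources = surviving data blocks + parity -> dst is the missing block.
 */
void xor_blocks_sketch(uint8_t *dst, const uint8_t *const *src,
                       size_t nsrc, size_t len)
{
    assert(dst != NULL && src != NULL && nsrc > 0);

    for (size_t i = 0; i < len; i++) {
        uint8_t acc = 0;

        /* Fold byte i of every source block together. */
        for (size_t s = 0; s < nsrc; s++)
            acc ^= src[s][i];
        dst[i] = acc;
    }
}
```

Calling it once over the data blocks yields parity; calling it again over
the parity plus all but one data block reproduces the block left out.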
All this is difficult within either the MD or DM RAID frameworks,
because optimizing each RAID transaction requires intimate knowledge of
the hardware. We have the knowledge... but I don't have good ideas --
aside from an SX4-specific RAID 0/1/5/6 driver -- on how to exploit this
knowledge.
Traditionally the vendor has distributed a SCSI driver that implements
the necessary RAID stack pieces entirely in the hardware driver itself.
That sort of approach definitely works, but is traditionally rejected
by upstream maintainers because it essentially requires a third (if h/w
specific) RAID stack.
Jeff
On 1/3/07, Jeff Garzik <[email protected]> wrote:
> The first open hardware docs for the Promise SX4 (sata_sx4) series are
> now available:
> http://gkernel.sourceforge.net/specs/promise/pdc20621-pguide-dimm-1.6.pdf.bz2
> http://gkernel.sourceforge.net/specs/promise/pdc20621-pguide-pll-ata-timing-1.2.pdf.bz2
>
> These are only small, ancillary guides; the main hardware doc should be
> opened soon.
>
> However, I would like to take this opportunity to point hackers looking
> for a project at this hardware. The Promise SX4 is pretty neat, and it
> needs more attention than I can give, to reach its full potential.
>
> Here's a braindump:
>
> * It's an older chipset/board, probably not actively sold anymore
>
> * ATA programming interface very close to sata_promise (PDC2037x)
>
> * Contains on-board DIMM, to be used for any purpose the driver desires
>
> * Contains on-board RAID5 XOR, also fully programmable
>
> A key problem is that, under Linux, sata_sx4 cannot fully exploit the
> RAID-centric power of this hardware, because it drives the hardware in
> "dumb ATA mode".
>
> A better driver would notice when a RAID1 or RAID5 array contains
> multiple components attached to the SX4, and send only a single copy of
> the data to the card (saving PCI bus bandwidth tremendously).
> Similarly, a better driver would take advantage of the RAID5 XOR offload
> capabilities, to offload the entire RAID5 read or write transaction to
> the card.
>
> All this is difficult within either the MD or DM RAID frameworks,
> because optimizing each RAID transaction requires intimate knowledge of
> the hardware. We have the knowledge... but I don't have good ideas --
> aside from an SX4-specific RAID 0/1/5/6 driver -- on how to exploit this
> knowledge.
>
Can the XOR engines on the card access host memory or do they only
operate on the card's local memory? And are there DMA (copy) engines?
> Traditionally the vendor has distributed a SCSI driver that implements
> the necessary RAID stack pieces entirely in the hardware driver itself.
> That sort of approach definitely works, but is traditionally rejected
> by upstream maintainers because it essentially requires a third (if h/w
> specific) RAID stack.
>
[ brainstorming ]
For the RAID5 case: With the hardware acceleration patches, the part of
MD that actually operates on the stripe cache is factored out of
handle_stripe5. So MD can be used to handle all of the cache state
information and then use an SX4-specific operations routine to handle
the data transfers. However, this assumes that all the members of the
array are behind the SX4 (which is the same assumption the vendor
driver makes, so maybe it's not so bad).
The part I am lost on is how you tell libata and the LLDD that
data for a transaction is coming from or headed to device-local memory
and not host memory. It seems there would need to be changes to the
DMA API to handle this distinction?
> Jeff
>
Dan
Dan Williams wrote:
> On 1/3/07, Jeff Garzik <[email protected]> wrote:
>> The first open hardware docs for the Promise SX4 (sata_sx4) series are
>> now available:
>> http://gkernel.sourceforge.net/specs/promise/pdc20621-pguide-dimm-1.6.pdf.bz2
>> http://gkernel.sourceforge.net/specs/promise/pdc20621-pguide-pll-ata-timing-1.2.pdf.bz2
>>
>> These are only small, ancillary guides; the main hardware doc should be
>> opened soon.
>>
>> However, I would like to take this opportunity to point hackers looking
>> for a project at this hardware. The Promise SX4 is pretty neat, and it
>> needs more attention than I can give, to reach its full potential.
>>
>> Here's a braindump:
>>
>> * It's an older chipset/board, probably not actively sold anymore
>>
>> * ATA programming interface very close to sata_promise (PDC2037x)
>>
>> * Contains on-board DIMM, to be used for any purpose the driver desires
>>
>> * Contains on-board RAID5 XOR, also fully programmable
>>
>> A key problem is that, under Linux, sata_sx4 cannot fully exploit the
>> RAID-centric power of this hardware, because it drives the hardware in
>> "dumb ATA mode".
>>
>> A better driver would notice when a RAID1 or RAID5 array contains
>> multiple components attached to the SX4, and send only a single copy of
>> the data to the card (saving PCI bus bandwidth tremendously).
>> Similarly, a better driver would take advantage of the RAID5 XOR offload
>> capabilities, to offload the entire RAID5 read or write transaction to
>> the card.
>>
>> All this is difficult within either the MD or DM RAID frameworks,
>> because optimizing each RAID transaction requires intimate knowledge of
>> the hardware. We have the knowledge... but I don't have good ideas --
>> aside from an SX4-specific RAID 0/1/5/6 driver -- on how to exploit this
>> knowledge.
>>
> Can the XOR engines on the card access host memory or do they only
> operate on the card's local memory? And are there DMA (copy) engines?
XOR only operates on the card's local memory.
There is one DMA-copy engine, for copying host<->card memory. It can be
polled or interrupt-driven. There are four ATA engines (one for each
port), each with DMA engines of their own that only talk to the card's
local memory.
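To make the bandwidth argument concrete with this engine layout, here is
a rough accounting sketch in C for a RAID1 write. It is a model only, not
driver code: the struct and function names are invented, and it assumes
that in "dumb ATA mode" every per-disk command carries its own copy of
the data across PCI, while an offloaded write uses the copy engine once
and lets each ATA engine read the same card-local buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Four ATA engines, one per port (from the description above). */
#define SX4_SKETCH_PORTS 4

/* Hypothetical accounting record, for illustration only. */
struct bus_traffic {
    size_t pci_bytes;   /* host <-> card memory, via the DMA-copy engine */
    size_t local_bytes; /* card memory <-> disks, via the ATA engines */
};

/* Assumed "dumb ATA mode": one host->card copy per mirror. */
struct bus_traffic raid1_write_dumb(size_t len, unsigned nmirrors)
{
    assert(nmirrors >= 1 && nmirrors <= SX4_SKETCH_PORTS);

    struct bus_traffic t = {
        .pci_bytes = len * nmirrors,
        .local_bytes = len * nmirrors,
    };
    return t;
}

/* Offloaded: a single host->card copy, shared by all ATA engines. */
struct bus_traffic raid1_write_offload(size_t len, unsigned nmirrors)
{
    assert(nmirrors >= 1 && nmirrors <= SX4_SKETCH_PORTS);

    struct bus_traffic t = {
        .pci_bytes = len,
        .local_bytes = len * nmirrors,
    };
    return t;
}
```

Under these assumptions, the card-side traffic is the same in both cases;
only the PCI traffic shrinks, by a factor equal to the number of mirrors.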
>> Traditionally the vendor has distributed a SCSI driver that implements
>> the necessary RAID stack pieces entirely in the hardware driver itself.
>> That sort of approach definitely works, but is traditionally rejected
>> by upstream maintainers because it essentially requires a third (if h/w
>> specific) RAID stack.
>>
> [ brainstorming ]
> For the RAID5 case: With the hardware acceleration patches, the part of
> MD that actually operates on the stripe cache is factored out of
> handle_stripe5. So MD can be used to handle all of the cache state
> information and then use an SX4-specific operations routine to handle
> the data transfers. However, this assumes that all the members of the
> array are behind the SX4 (which is the same assumption the vendor
> driver makes, so maybe it's not so bad).
>
> The part I am lost on is how you tell libata and the LLDD that
> data for a transaction is coming from or headed to device-local memory
> and not host memory. It seems there would need to be changes to the
> DMA API to handle this distinction?
That is indeed the tough part. Open questions. I'm not even convinced
that it's a good idea to pretend that card local memory is accessible
through the existing DMA API... abstracting that may cost more than
inventing a new API for such a situation.
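Purely to illustrate the distinction under discussion, one shape such a
"new API" could start from is a buffer descriptor that says where the
data lives. Nothing below exists in libata or the DMA API; every name
(buf_location, xfer_buf, needs_host_copy) is hypothetical, and this is a
sketch of the idea rather than a proposal for the actual interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: where a transfer buffer lives. */
enum buf_location {
    BUF_HOST, /* ordinary host memory */
    BUF_CARD, /* the card's on-board DIMM */
};

/* Hypothetical descriptor carrying the location along with the data. */
struct xfer_buf {
    enum buf_location where;
    union {
        void *host_addr;      /* valid when where == BUF_HOST */
        uint32_t card_offset; /* valid when where == BUF_CARD */
    } u;
    size_t len;
};

/*
 * The ATA engines only talk to card-local memory, so a host buffer
 * must first pass through the host<->card DMA-copy engine. A buffer
 * already in card memory (e.g. XOR output) can skip that step.
 */
bool needs_host_copy(const struct xfer_buf *buf)
{
    assert(buf != NULL);
    return buf->where == BUF_HOST;
}
```

A low-level driver could branch on needs_host_copy() when building a
command, which keeps the card-memory case out of the existing DMA
mapping path altogether.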
Jeff