2008-01-07 04:12:29

by Robert Hancock

Subject: Re: Believed resolved: SATA kern-buffRd read slow: based on promise driver bug

Linda Walsh wrote:
> Mikael Pettersson wrote:
>> Linda Walsh writes:
>> > Mikael Pettersson wrote:
>> > > Linda Walsh writes:
>> > > > > Linda Walsh wrote:
>> > > > >>>> read rate began falling; (.25 - .3);
>> > > > more importantly; a chronic error message associated
>> > > > with drive may be causing some or all of the problem(s):
>> > > > ---
>> > > > ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
>> > > > ata1.00: port_status 0x20080000
>> > > > ata1.00: cmd c8/00:10:30:06:03/00:00:00:00:00/e0 tag 0 cdb 0x0 data 8192 in
>> > > > res 50/00:00:3f:06:03/00:00:00:00:00/e0 Emask 0x2 (HSM violation)
>> > > > ata1: limiting SATA link speed to 1.5 Gbps
>> > >
>> > > Looks like the Promise ASIC SG bug. Apply
>> > >
>> > > <http://user.it.uu.se/~mikpe/linux/patches/sata_promise/patch-sata_promise-1-asic-sg-bug-fix-v3-2.6.23>
>>
>> > > and let us know if things improve.
>> > > /Mikael
>> > ---
>> > Yep! Hope that's making it into a patch soon or, at least 2.6.24.
>> > Kernel buffered
>> Good to hear that it solved this problem.
>> The patch is in 2.6.24-rc2 and newer kernels, and will be sent
>> to -stable for the 2.6.23 and 2.6.22 series.
>>
> ---
> Will 'likely' wait till -stable since I use the machine as a 'server'
> for just about any/everything that needs "serving" or "proxy" services.
>> > That and I'd like to find out why TCQ/NCQ doesn't work with the
>> > Seagate drives --
>>
>> The driver doesn't yet support NCQ.
> ----
> Is 'main' diff between NCQ/TCQ that TCQ can re-arrange 'write'
> priority under driver control, whereas NCQ is mostly a FIFO queue?

First off, one has to distinguish between ATA TCQ and SCSI TCQ. ATA TCQ
is essentially abandoned; very few drives, and fewer still controllers
and matching drivers, ever supported it.

SCSI TCQ has "head of queue", "ordered" and "simple" queueing modes. ATA
NCQ effectively only has "simple", where the drive always decides what
order to service the requests in. There is also a FUA (Force Unit
Access) bit, which tells the drive that the command (normally a write)
has to reach the physical media before completion is reported.
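
To make the difference between the three modes concrete, here is a toy
model in Python. It is only a sketch, not SCSI or libata code: the
function name, the attribute strings and the "sort by LBA" policy are
all assumptions made for illustration, and it pretends every command is
queued before the drive services any of them.

```python
# Toy model of queue-tag attributes; not real SCSI or libata code.
# Assumption: the "drive" reorders SIMPLE commands by ascending LBA.

def service_order(commands):
    """commands: list of (name, attribute, lba) in submission order.

    Returns the command names in the order this toy drive services them.
    """
    batches = [[]]   # runs of SIMPLE commands separated by ORDERED barriers
    front = []       # HEAD_OF_QUEUE commands go ahead of everything queued
    for name, attr, lba in commands:
        if attr == "head_of_queue":
            front.insert(0, name)            # newest one is serviced first
        elif attr == "ordered":
            batches.append([(lba, name)])    # barrier: runs on its own
            batches.append([])               # later commands stay behind it
        else:                                # "simple": drive picks the order
            batches[-1].append((lba, name))
    result = list(front)
    for batch in batches:
        result.extend(name for _, name in sorted(batch))
    return result


scsi_like = [
    ("A", "simple", 900),
    ("B", "simple", 100),
    ("C", "ordered", 500),
    ("D", "simple", 300),
    ("E", "head_of_queue", 700),
    ("F", "simple", 200),
]
print(service_order(scsi_like))   # ['E', 'B', 'A', 'C', 'F', 'D']

# With NCQ everything is effectively "simple", so the drive orders it all.
ncq_like = [("A", "simple", 900), ("B", "simple", 100), ("C", "simple", 500)]
print(service_order(ncq_like))    # ['B', 'C', 'A']
```

In the first list the host can pin C between the two groups and push E
to the front; in the second the host has no say in the order at all.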

>
> On a Journal'ed file system, isn't "write-order" control required
> for integrity? That would seem to imply TCQ could be used, but
> NCQ doesn't seem to offer much benefit, since the higher level
> kernel drivers usually have a "larger picture" of sectors that need
> to be written. The only advantage I can see for NCQ drives might

There are cases where writes need to complete in a specific order. This
can be done either by using FUA bits (though libata doesn't do this by
default currently) or by issuing cache flushes before and after certain
commands.
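
As a rough illustration of the two approaches, the sketch below models a
drive with a volatile write cache. ToyDrive, commit() and the journal
layout are hypothetical names invented for this example; nothing here is
taken from libata or from any real filesystem.

```python
# Toy model of a drive with a volatile write cache; purely illustrative.

class ToyDrive:
    def __init__(self):
        self.cache = {}   # volatile write cache: lba -> data
        self.media = {}   # persistent media: lba -> data

    def write(self, lba, data, fua=False):
        if fua:
            # FUA: the data must be on the media before completion.
            self.cache.pop(lba, None)   # drop any stale cached copy
            self.media[lba] = data
        else:
            self.cache[lba] = data      # may reach the media at any later time

    def flush(self):
        # Cache flush: everything cached so far is forced to the media.
        self.media.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        # Whatever is still only in the cache is lost.
        self.cache.clear()
        return dict(self.media)


def commit(drive, journal_blocks, commit_lba, use_fua=False):
    """Write journal blocks, then a commit record, in a guaranteed order."""
    for lba, data in journal_blocks.items():
        drive.write(lba, data)
    drive.flush()                       # journal body is durable first
    if use_fua:
        drive.write(commit_lba, "COMMIT", fua=True)
    else:
        drive.write(commit_lba, "COMMIT")
        drive.flush()                   # then the commit record


drive = ToyDrive()
commit(drive, {10: "j0", 11: "j1"}, commit_lba=12)
print(drive.power_loss())   # {10: 'j0', 11: 'j1', 12: 'COMMIT'}
```

Either way the commit record can never be on the media without the
journal blocks it refers to, which is the ordering property that matters.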

> be that the kernel may not know the drive's internal physical
> structure nor where the disk is in its current revolution. That could
> allow drive write re-ordering where based on the exact-current state
> of the drive that the kernel might not have access to, but it seems
> this would be a minor benefit -- and, depending on firmware,
> possibly higher overhead in command processing?

That's a big part of it. The kernel doesn't necessarily know what
sectors will be the fastest to access at any given time whereas the
drive can.
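
A crude way to see the benefit: let the "drive" always pick the pending
request closest to where its head currently is. This is only a sketch
under heavy assumptions (position is a single number, rotational latency
is ignored, and drive_order() and travel() are made-up helper names);
real firmware is a good deal more sophisticated.

```python
# Greedy "nearest request first" model of drive-side reordering.
# Assumption: access cost is just the distance between LBAs.

def drive_order(head, pending):
    """Return the pending LBAs in the order a nearest-first drive picks them."""
    remaining = list(pending)
    order = []
    while remaining:
        nxt = min(remaining, key=lambda lba: abs(lba - head))
        remaining.remove(nxt)
        order.append(nxt)
        head = nxt            # the head is now at the serviced request
    return order


def travel(head, order):
    """Total head movement needed to service the requests in this order."""
    total = 0
    for lba in order:
        total += abs(lba - head)
        head = lba
    return total


pending = [900, 100, 520, 480, 880]        # order the host issued them in
reordered = drive_order(500, pending)
print(reordered)                           # [520, 480, 100, 880, 900]
print(travel(500, pending))                # 2060 (strict FIFO)
print(travel(500, reordered))              # 1240 (drive-chosen order)
```

The host could sort by LBA as well, but only the drive knows the real
starting position and geometry at the moment it picks the next request.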

Also, NCQ has some other improvements that are independent of actually
queueing commands - for instance, because the drive controls the DMA
data transfer, it can transfer the data for one request in an
out-of-order fashion instead of having to transfer to the host strictly
from beginning to end as in traditional ATA.
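
From the host's side, out-of-order delivery within one request amounts
to each piece of data arriving with the buffer offset it belongs at. The
snippet below is a minimal sketch of that idea only; receive() and the
(offset, data) chunk format are assumptions for illustration and do not
reflect the actual FIS layout or any driver code.

```python
# Toy model: the drive delivers pieces of one read in whatever order suits
# it, and each piece says where in the host buffer it belongs.

def receive(total_len, chunks):
    """Assemble a host buffer from (offset, data) pieces in arrival order."""
    buf = bytearray(total_len)
    for offset, data in chunks:
        buf[offset:offset + len(data)] = data
    return bytes(buf)


# The second half of the request happens to be ready first.
print(receive(8, [(4, b"EFGH"), (0, b"ABCD")]))   # b'ABCDEFGH'
```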

>
> Am trying to differentiate NCQ/TCQ and SAS v. SCSI benefits.
> It seems both support (SAS & SATA) some type of port-multiplier/
> multiplexor/ option to allow more disks/port.
>
> However, (please correct?) SATA uses a hub type architecture while
> SAS uses a switch architecture. My experience with network hubs vs.
> switches is that network hubs can be much slower if there is
> communication contention. Is the word 'hub' being used in the
> "shared-communication media sense", or is someone using the term
> 'hub' as a [sic] replacement for a 'switch'?

I believe that they're essentially the same in that regard, though
someone can correct me if I'm wrong.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/