2006-10-13 23:56:29

by djwong

[permalink] [raw]
Subject: [PATCH] libsas: support NCQ for SATA disks

This patch adds SATAII NCQ support to libsas. Both the use_ncq and the
dma_xfer flags in ata_task must be set for NCQ to work correctly on the
Adaptec SAS controller. The rest of the patch adds ATA_FLAG_NCQ to
sata_port_info and sets up ap->scsi_host so that ata_setup_ncq doesn't
crash. Please note that this patch is against the aic94xx-sas git tree,
not scsi-misc. Thanks also to James Bottomley for providing an earlier
version of this patch from which to work.

I've tested this patch on a x206m with a ST380819AS SATA2 disk plugged
into the Adaptec SAS controller. The drive came up with a queue depth
of 31, and I successfully ran an I/O flood test to coerce libata into
sending multiple commands simultaneously. A kernel probe recorded the
maximum tag number that had been seen before and after the flood test;
before the test it was 2 and after it was 30, as I expected.

Signed-off-by: Darrick J. Wong <[email protected]>

--

diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 60d13b1..77a9be8 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -673,8 +673,14 @@ static unsigned int sas_ata_qc_issue(str
task->ata_task.retry_count = 1;
task->task_state_flags = SAS_TASK_STATE_PENDING;

- if (qc->tf.protocol == ATA_PROT_DMA)
+ switch (qc->tf.protocol) {
+ case ATA_PROT_NCQ:
+ task->ata_task.use_ncq = 1;
+ /* fall through */
+ case ATA_PROT_DMA:
task->ata_task.dma_xfer = 1;
+ break;
+ }

if (sas_ha->lldd_max_execute_num < 2)
res = i->dft->lldd_execute_task(task, 1, GFP_ATOMIC);
@@ -807,7 +813,7 @@ static struct ata_port_operations sas_sa

static struct ata_port_info sata_port_info = {
.flags = ATA_FLAG_SATA | ATA_FLAG_NO_LEGACY | ATA_FLAG_SATA_RESET |
- ATA_FLAG_MMIO | ATA_FLAG_PIO_DMA,
+ ATA_FLAG_MMIO | ATA_FLAG_PIO_DMA | ATA_FLAG_NCQ,
.pio_mask = 0x1f, /* PIO0-4 */
.mwdma_mask = 0x07, /* MWDMA0-2 */
.udma_mask = ATA_UDMA6,
@@ -875,6 +881,7 @@ int sas_target_alloc(struct scsi_target

ap->private_data = found_dev;
ap->cbl = ATA_CBL_SATA;
+ ap->scsi_host = shost;
found_dev->sata_dev.ap = ap;
}


2006-10-15 17:42:10

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH] libsas: support NCQ for SATA disks

On Fri, 2006-10-13 at 16:56 -0700, Darrick J. Wong wrote:
> I've tested this patch on a x206m with a ST380819AS SATA2 disk plugged
> into the Adaptec SAS controller. The drive came up with a queue depth
> of 31, and I successfully ran an I/O flood test to coerce libata into
> sending multiple commands simultaneously. A kernel probe recorded the
> maximum tag number that had been seen before and after the flood test;
> before the test it was 2 and after it was 30, as I expected.

This doesn't seem to quite work for me on a SATA-1 disc:

sas: DOING DISCOVERY on port 1, pid:1897
sas: sas_ata_phy_reset: Found ATA device.
ata1.00: ATA-7, max UDMA/133, 781422768 sectors: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
scsi 2:0:1:0: Direct-Access ATA ST3400832AS 3.03 PQ: 0
ANSI: 5
SCSI device sdc: 781422768 512-byte hdwr sectors (400088 MB)
sdc: Write Protect is off
SCSI device sdc: drive cache: write back
SCSI device sdc: 781422768 512-byte hdwr sectors (400088 MB)
sdc: Write Protect is off
SCSI device sdc: drive cache: write back
sdc: unknown partition table
sd 2:0:1:0: Attached scsi disk sdc
sas: DONE DISCOVERY on port 1, pid:1897, result:0
sas: command 0xf785f3c0, task 0x00000000, timed out: EH_HANDLED
sas: command 0xf785f3c0, task 0x00000000, timed out: EH_HANDLED
[...]

It looks like the first few commands get through (read capacity, ATA
IDENTIFY etc) and it hangs up on the read for the partition table.

James


2006-10-15 18:43:19

by djwong

[permalink] [raw]
Subject: Re: [PATCH] libsas: support NCQ for SATA disks

James Bottomley wrote:

> This doesn't seem to quite work for me on a SATA-1 disc:
>
> sas: DOING DISCOVERY on port 1, pid:1897
> sas: sas_ata_phy_reset: Found ATA device.
> ata1.00: ATA-7, max UDMA/133, 781422768 sectors: LBA48 NCQ (depth 31/32)
> ata1.00: configured for UDMA/133
> scsi 2:0:1:0: Direct-Access ATA ST3400832AS 3.03 PQ: 0
> ANSI: 5
> SCSI device sdc: 781422768 512-byte hdwr sectors (400088 MB)
> sdc: Write Protect is off
> SCSI device sdc: drive cache: write back
> SCSI device sdc: 781422768 512-byte hdwr sectors (400088 MB)
> sdc: Write Protect is off
> SCSI device sdc: drive cache: write back
> sdc: unknown partition table
> sd 2:0:1:0: Attached scsi disk sdc
> sas: DONE DISCOVERY on port 1, pid:1897, result:0
> sas: command 0xf785f3c0, task 0x00000000, timed out: EH_HANDLED
> sas: command 0xf785f3c0, task 0x00000000, timed out: EH_HANDLED
> [...]
>
> It looks like the first few commands get through (read capacity, ATA
> IDENTIFY etc) and it hangs up on the read for the partition table.

Hm... if I put in some debug printks in the qc_issue code, I get the
same symptoms. I've observed that once again we get hung up on ATA
commands where the tag number > 0. I also noticed this pattern:

1. ATA command w/ tag 0 (command A) issued.
2. Command A goes out to sas-ata.
2. ATA command w/ tag 1 (command B) issued.
3. Command A completes
4. Command B goes out to sas-ata.
[...]
5. Command B times out.

Very odd that this all works if there are no printks. I don't see
anything obvious that would suggest why this apparent race seems to
happen--unless there's some conflict between issuing an ATA command
while completing another one.

--D

2006-10-15 19:42:56

by Luben Tuikov

[permalink] [raw]
Subject: Re: [PATCH] libsas: support NCQ for SATA disks

--- "Darrick J. Wong" <[email protected]> wrote:
> James Bottomley wrote:
>
> > This doesn't seem to quite work for me on a SATA-1 disc:
> >
> > sas: DOING DISCOVERY on port 1, pid:1897
> > sas: sas_ata_phy_reset: Found ATA device.
> > ata1.00: ATA-7, max UDMA/133, 781422768 sectors: LBA48 NCQ (depth 31/32)
> > ata1.00: configured for UDMA/133
> > scsi 2:0:1:0: Direct-Access ATA ST3400832AS 3.03 PQ: 0
> > ANSI: 5
> > SCSI device sdc: 781422768 512-byte hdwr sectors (400088 MB)
> > sdc: Write Protect is off
> > SCSI device sdc: drive cache: write back
> > SCSI device sdc: 781422768 512-byte hdwr sectors (400088 MB)
> > sdc: Write Protect is off
> > SCSI device sdc: drive cache: write back
> > sdc: unknown partition table
> > sd 2:0:1:0: Attached scsi disk sdc
> > sas: DONE DISCOVERY on port 1, pid:1897, result:0
> > sas: command 0xf785f3c0, task 0x00000000, timed out: EH_HANDLED
> > sas: command 0xf785f3c0, task 0x00000000, timed out: EH_HANDLED
> > [...]
> >
> > It looks like the first few commands get through (read capacity, ATA
> > IDENTIFY etc) and it hangs up on the read for the partition table.
>
> Hm... if I put in some debug printks in the qc_issue code, I get the
> same symptoms. I've observed that once again we get hung up on ATA
> commands where the tag number > 0. I also noticed this pattern:
>
> 1. ATA command w/ tag 0 (command A) issued.
> 2. Command A goes out to sas-ata.
> 2. ATA command w/ tag 1 (command B) issued.
> 3. Command A completes
> 4. Command B goes out to sas-ata.
> [...]
> 5. Command B times out.
>
> Very odd that this all works if there are no printks. I don't see
> anything obvious that would suggest why this apparent race seems to
> happen--unless there's some conflict between issuing an ATA command
> while completing another one.

Keep debugging.

The GPL open sourced SCSI/ATA Translation Layer (SATL) I maintain,
works perfectly for any SATA drive w/ NCQ (through the aic94xx
SDS/interconnect), and I haven't heard any complaints from people
using it.

Luben

2006-10-16 22:40:38

by Brian King

[permalink] [raw]
Subject: Re: [PATCH] libsas: support NCQ for SATA disks

Darrick J. Wong wrote:
> This patch adds SATAII NCQ support to libsas. Both the use_ncq and the
> dma_xfer flags in ata_task must be set for NCQ to work correctly on the
> Adaptec SAS controller. The rest of the patch adds ATA_FLAG_NCQ to
> sata_port_info and sets up ap->scsi_host so that ata_setup_ncq doesn't
> crash. Please note that this patch is against the aic94xx-sas git tree,
> not scsi-misc. Thanks also to James Bottomley for providing an earlier
> version of this patch from which to work.


> @@ -875,6 +881,7 @@ int sas_target_alloc(struct scsi_target
>
> ap->private_data = found_dev;
> ap->cbl = ATA_CBL_SATA;
> + ap->scsi_host = shost;
> found_dev->sata_dev.ap = ap;
> }
>

This doesn't look like the right fix for the oops you were seeing. The
SAS usage of libata has ap->scsi_host as NULL, which indicates that
libata does not own the associated scsi_host. I'm concerned you may
have broken some other code path by making this change. I think the correct
fix may require removing the dependence of ap->scsi_host from
ata_dev_config_ncq.

Brian

--
Brian King
eServer Storage I/O
IBM Linux Technology Center

2006-10-17 11:15:44

by Jeff Garzik

[permalink] [raw]
Subject: Re: [PATCH] libsas: support NCQ for SATA disks

Brian King wrote:
> This doesn't look like the right fix for the oops you were seeing. The
> SAS usage of libata has ap->scsi_host as NULL, which indicates that
> libata does not own the associated scsi_host. I'm concerned you may
> have broken some other code path by making this change. I think the correct
> fix may require removing the dependence of ap->scsi_host from
> ata_dev_config_ncq.

Yep. I had already mentioned this on IRC.

Jeff


2006-10-17 16:38:40

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH] libsas: support NCQ for SATA disks

On Tue, 2006-10-17 at 07:15 -0400, Jeff Garzik wrote:
> Brian King wrote:
> > This doesn't look like the right fix for the oops you were seeing. The
> > SAS usage of libata has ap->scsi_host as NULL, which indicates that
> > libata does not own the associated scsi_host. I'm concerned you may
> > have broken some other code path by making this change. I think the correct
> > fix may require removing the dependence of ap->scsi_host from
> > ata_dev_config_ncq.
>
> Yep. I had already mentioned this on IRC.

I understand, but right at the moment, my priority is sorting out the
aic94xx driver so that it works with SATA devices. It has become
apparent that there's some need for a bit of code sorting out in libata
to drive intelligent sas controllers, so we can take a look at bugs in
ata_dev_config_ncq() when someone's time frees up to look into the
libata issues.

James



2006-10-17 22:42:24

by Luben Tuikov

[permalink] [raw]
Subject: Re: [PATCH] libsas: support NCQ for SATA disks

--- James Bottomley <[email protected]> wrote:
> On Tue, 2006-10-17 at 07:15 -0400, Jeff Garzik wrote:
> > Brian King wrote:
> > > This doesn't look like the right fix for the oops you were seeing. The
> > > SAS usage of libata has ap->scsi_host as NULL, which indicates that
> > > libata does not own the associated scsi_host. I'm concerned you may
> > > have broken some other code path by making this change. I think the correct
> > > fix may require removing the dependence of ap->scsi_host from
> > > ata_dev_config_ncq.
> >
> > Yep. I had already mentioned this on IRC.
>
> I understand, but right at the moment, my priority is sorting out the
> aic94xx driver so that it works with SATA devices. It has become
> apparent that there's some need for a bit of code sorting out in libata
> to drive intelligent sas controllers, so we can take a look at bugs in
> ata_dev_config_ncq() when someone's time frees up to look into the
> libata issues.

Just to give people some freedom of choice:

I feel the need to mention that my code as I support it, the
SAS Stack, does include fully T10-complient SATL, which does SAT on
per device basis, leaving the transport be whatever it need be.
I.e. no messing around with "ata ports" or "hosts" or what not.

A provider (SAS Stack, software ATA emulation, whatever) simply
_registers_ with SATL, declaring in a structure of method stubs
what it can do, and then SATL drives the device. This is similar to
how my SAS Stack works with SAS LLDD (the bottomleyised version of my
code uses a libsas "on the side" instead of being a fully fledged
layer as intended and originally written).

The SATL also does SATA feature configuration as per the declared
capabilities of the LLDD (aic94xx, or whatever) _and_ as per the
device declared capabilities. I.e. if one or the other doesn't
support it, then it is turned off, if both support it it is on.

The SATL also supports NCQ, S-ATAPI, full error recovery (NCQ, etc).

It also supports MODE SENSE, MODE SELECT, ATA Pass-through (all protocols),
S-ATAPI, etc, etc, as per SAT.*

Bottomley has simply made a decision for you, what kind of code
you should be using, what it supports and how it supports it.

Good luck everyone!
Luben

* For example, I exclusively cut CDs/DVDs using a S-ATAPI DVD-RW device
attached to an expander using the SAS Stack and SATL.