Hi all,
the block layer usually tries to merge s/g segments if consecutive
segments combined fit into the queue's max_segment_size. When such a
scatter gather list is DMA-mapped, can it happen that an IOMMU collapses
the elements even further, so that sg_dma_len() of a DMA-mapped s/g
segment exceeds max_segment_size?
As I understood some discussions in the past, this could indeed happen,
which is a nuisance. But I may have misunderstood something, or
something may have changed in the meantime...
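For illustration, the merge rule can be simulated in plain userspace C. This is only a sketch with invented names (struct seg, merge_segs), not the block layer's actual implementation:

```c
#include <assert.h>
#include <stddef.h>

/* A fake s/g segment: bus address plus length. */
struct seg {
	unsigned long addr;
	unsigned int len;
};

/*
 * Collapse physically contiguous neighbours as long as the combined
 * length still fits max_seg, in the spirit of the block layer's
 * segment merging.  Returns the number of segments left.
 */
static size_t merge_segs(struct seg *v, size_t n, unsigned int max_seg)
{
	size_t i, out = 0;

	for (i = 0; i < n; i++) {
		if (out > 0 &&
		    v[out - 1].addr + v[out - 1].len == v[i].addr &&
		    v[out - 1].len + v[i].len <= max_seg)
			v[out - 1].len += v[i].len;	/* merge into previous */
		else
			v[out++] = v[i];		/* start a new segment */
	}
	return out;
}
```

With max_seg = 0x2000, three contiguous 0x1000-byte segments collapse into one 0x2000-byte segment plus one 0x1000-byte segment; the question above is whether an IOMMU can later collapse even the remaining boundary.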
--
Stefan Richter
-=====-==--- =--- -=---
http://arcgraph.de/sr/
On Fri, Aug 8, 2008 at 12:44 PM, Stefan Richter
<[email protected]> wrote:
> Hi all,
>
> the block layer usually tries to merge s/g segments if consecutive segments
> combined fit into the queue's max_segment_size. When such a scatter gather
> list is DMA-mapped, can it happen that an IOMMU collapses the elements even
> further, so that sg_dma_len() of a DMA-mapped s/g segment exceeds
> max_segment_size?
I don't see how. The IOMMU code only collapses the "physical" mappings and
does not add new elements to the SG list. Ergo sg_dma_len() shouldn't change.
grant
>
> As I understood some discussions in the past, this could indeed happen,
> which is a nuisance. But I may have misunderstood something, or something
> may have changed in the meantime...
> --
> Stefan Richter
> -=====-==--- =--- -=---
> http://arcgraph.de/sr/
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
Grant Grundler wrote:
> On Fri, Aug 8, 2008 at 12:44 PM, Stefan Richter
> <[email protected]> wrote:
>> Hi all,
>>
>> the block layer usually tries to merge s/g segments if consecutive segments
>> combined fit into the queue's max_segment_size. When such a scatter gather
>> list is DMA-mapped, can it happen that an IOMMU collapses the elements even
>> further, so that sg_dma_len() of a DMA-mapped s/g segment exceeds
>> max_segment_size?
>
> I don't see how. The IOMMU code only collapses the "physical" mappings and
> does not add new elements to the SG list. ergo sg_dma_len() shouldn't change.
>
> grant
>
>> As I understood some discussions in the past, this could indeed happen,
>> which is a nuisance. But I may have misunderstood something, or something
>> may have changed in the meantime...
Well, I'm just doing my homework and am tracking down the various
dma_map_sg implementations.
Here is what PPC does:
http://lxr.linux.no/linux+v2.6.26/arch/powerpc/kernel/iommu.c#L270
It looks at dma_get_max_seg_size(dev) and then merges according to it.
That's all nice and well, but in my case (FireWire storage protocol
a.k.a. SBP-2, which is basically remote DMA), the max_segment_size of
the PCI device is different from the size limit of the protocol. We
currently have to deconstruct such merges in the sbp2 drivers again:
http://lxr.linux.no/linux+v2.6.26/drivers/firewire/fw-sbp2.c#L1384
Either I keep it that way, or I let the protocol driver manipulate the
FireWire controller's dev->dma_parms->max_segment_size (via
dma_set_max_seg_size() of course), which is not entirely correct.
I first wanted to use blk_queue_max_segment_size(), but that falls short
when dma_map_sg implementations like the above are at work.
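The deconstruction mentioned above can be sketched in userspace C; struct seg and split_segs are invented for the sketch, while the real code in the sbp2 drivers walks the mapped list via sg_dma_len()/sg_dma_address():

```c
#include <assert.h>
#include <stddef.h>

/* A fake DMA-mapped s/g segment: bus address plus length. */
struct seg {
	unsigned long addr;
	unsigned int len;
};

/*
 * Split each mapped segment into chunks no larger than limit, writing
 * the result into out[].  Returns the number of chunks produced.
 * This mirrors the splitting a protocol driver has to redo when the
 * mapping layer merged segments beyond the protocol's limit.
 */
static size_t split_segs(const struct seg *in, size_t n,
			 struct seg *out, unsigned int limit)
{
	size_t i, j = 0;

	for (i = 0; i < n; i++) {
		unsigned long addr = in[i].addr;
		unsigned int len = in[i].len;

		while (len) {
			unsigned int l = len > limit ? limit : len;

			out[j].addr = addr;
			out[j].len = l;
			addr += l;
			len -= l;
			j++;
		}
	}
	return j;
}
```

A single DMA-merged 0x10000-byte segment splits into a 0xf000-byte chunk plus a 0x1000-byte remainder, which is exactly the kind of work the protocol driver would rather push down to the mapping layer.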
--
Stefan Richter
-=====-==--- =--- -=---
http://arcgraph.de/sr/
On Fri, 08 Aug 2008 23:21:28 +0200
Stefan Richter <[email protected]> wrote:
> Grant Grundler wrote:
> > On Fri, Aug 8, 2008 at 12:44 PM, Stefan Richter
> > <[email protected]> wrote:
> >> Hi all,
> >>
> >> the block layer usually tries to merge s/g segments if consecutive segments
> >> combined fit into the queue's max_segment_size. When such a scatter gather
> >> list is DMA-mapped, can it happen that an IOMMU collapses the elements even
> >> further, so that sg_dma_len() of a DMA-mapped s/g segment exceeds
> >> max_segment_size?
> >
> > I don't see how. The IOMMU code only collapses the "physical" mappings and
> > does not add new elements to the SG list. ergo sg_dma_len() shouldn't change.
> >
> > grant
> >
> >> As I understood some discussions in the past, this could indeed happen,
> >> which is a nuisance. But I may have misunderstood something, or something
> >> may have changed in the meantime...
>
> Well, I'm just doing my homework and am tracking down the various
> dma_map_sg implementations.
>
> Here is what PPC does:
> http://lxr.linux.no/linux+v2.6.26/arch/powerpc/kernel/iommu.c#L270
>
> It looks at dma_get_max_seg_size(dev); and then merges according to it.
Yes, IOMMUs were not able to handle this issue before, but they should handle it now.
> That's all nice and well, but in my case (FireWire storage protocol
> a.k.a. SBP-2, which is basically remote DMA), the max_segment_size of
> the PCI device is different from the size limit of the protocol. We
> currently have to deconstruct such merges in the sbp2 drivers again:
> http://lxr.linux.no/linux+v2.6.26/drivers/firewire/fw-sbp2.c#L1384
>
> Either I keep it that way, or I let the protocol driver manipulate the
> FireWire controller's dev->dma_parms->max_segment_size (via
> dma_set_max_seg_size() of course), which is not entirely correct.
Why is it not correct? Device drivers can call
dma_set_max_seg_size(). SCSI does that (see __scsi_alloc_queue).
> I first wanted to use blk_queue_max_segment_size(), but that falls short
> when dma_map_sg implementations like the above are at work.
Yeah, we store the same value in two places (request_queue and device),
though it's not nice.
FUJITA Tomonori wrote:
> On Fri, 08 Aug 2008 23:21:28 +0200
> Stefan Richter <[email protected]> wrote:
>> Here is what PPC does:
>> http://lxr.linux.no/linux+v2.6.26/arch/powerpc/kernel/iommu.c#L270
>>
>> It looks at dma_get_max_seg_size(dev); and then merges according to it.
>
> Yes, IOMMUs were not able to handle this issue but they should now.
>
>
>> That's all nice and well, but in my case (FireWire storage protocol
>> a.k.a. SBP-2, which is basically remote DMA), the max_segment_size of
>> the PCI device is different from the size limit of the protocol. We
>> currently have to deconstruct such merges in the sbp2 drivers again:
>> http://lxr.linux.no/linux+v2.6.26/drivers/firewire/fw-sbp2.c#L1384
>>
>> Either I keep it that way, or I let the protocol driver manipulate the
>> FireWire controller's dev->dma_parms->max_segment_size (via
>> dma_set_max_seg_size() of course), which is not entirely correct.
>
> Why is it not correct? Device drivers can call
> dma_set_max_seg_size(). SCSI does that (see __scsi_alloc_queue).
FireWire is a multi-protocol bus. SBP-2 is just one of many quite
different protocols. SBP-2 targets read or write the initiator's memory
buffers via remote DMA. These buffers may be exposed as s/g lists to the
target. The protocol limits these s/g lists to up to 65535 elements of
up to 65535 bytes each.
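Each element of the SBP-2 "unrestricted page table" packs a 16-bit length together with the high 16 address bits into one quadlet and the low 32 address bits into the other. A host-endian userspace sketch (the driver stores both quadlets big-endian, and the caller must keep len at or below 0xffff):

```c
#include <assert.h>
#include <stdint.h>

/*
 * One SBP-2 unrestricted page table element, host-endian for this
 * sketch.  The real driver converts both quadlets with cpu_to_be32().
 */
struct sbp2_pte {
	uint32_t high;	/* length in bits 31..16, segment_base_hi below */
	uint32_t low;	/* segment_base_lo */
};

static struct sbp2_pte sbp2_make_pte(uint64_t addr, uint32_t len)
{
	struct sbp2_pte pte;

	/* len must not exceed the protocol's 16-bit field (0xffff) */
	pte.high = (len << 16) | (uint32_t)(addr >> 32);
	pte.low  = (uint32_t)addr;
	return pte;
}
```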
FireWire controllers on the other hand get their maximum segment size
set to 65536 by the PCI subsystem. (All FireWire controllers supported
by mainline Linux are PCI or PCIe devices.)
In the case of the drivers/firewire/ stack, the SBP-2 driver is currently
the only one which uses dma_map_sg. In the case of the drivers/ieee1394/
stack, the drivers for isochronous protocols, including userspace
drivers via raw1394, also use dma_map_sg.
So if the SBP-2 driver manipulated the controller device's
max_segment_size, it would influence the DMA mappings of the other
protocols. It wouldn't be a big deal; the isochronous mappings could
only be collapsed to chunks of at most 15 pages instead of 16 pages.
However, the mapping deconstructing code in the SBP-2 drivers is not a
big deal either.
--
Stefan Richter
-=====-==--- =--- -=---
http://arcgraph.de/sr/
On Fri, 08 Aug 2008 23:58:23 +0200
Stefan Richter <[email protected]> wrote:
> FUJITA Tomonori wrote:
> > On Fri, 08 Aug 2008 23:21:28 +0200
> > Stefan Richter <[email protected]> wrote:
> >> Here is what PPC does:
> >> http://lxr.linux.no/linux+v2.6.26/arch/powerpc/kernel/iommu.c#L270
> >>
> >> It looks at dma_get_max_seg_size(dev); and then merges according to it.
> >
> > Yes, IOMMUs were not able to handle this issue but they should now.
> >
> >
> >> That's all nice and well, but in my case (FireWire storage protocol
> >> a.k.a. SBP-2, which is basically remote DMA), the max_segment_size of
> >> the PCI device is different from the size limit of the protocol. We
> >> currently have to deconstruct such merges in the sbp2 drivers again:
> >> http://lxr.linux.no/linux+v2.6.26/drivers/firewire/fw-sbp2.c#L1384
> >>
> >> Either I keep it that way, or I let the protocol driver manipulate the
> >> FireWire controller's dev->dma_parms->max_segment_size (via
> >> dma_set_max_seg_size() of course), which is not entirely correct.
> >
> > Why is it not correct? Device drivers can set
> > dma_set_max_seg_size(). SCSI does that (see __scsi_alloc_queue).
>
> FireWire is a multi-protocol bus. SBP-2 is just one of many quite
> different protocols. SBP-2 targets read or write the initiator's memory
> buffers via remote DMA. These buffers may be exposed as s/g lists to the
> target. The protocol limits these s/g lists to up to 65535 elements of
> up to 65535 bytes each.
>
> FireWire controllers on the other hand get their maximum segment size
> set to 65536 by the PCI subsystem. (All FireWire controllers supported
> by mainline Linux are PCI or PCIe devices.)
>
> In the case of the drivers/firewire/ stack, the SBP-2 driver is currently
> the only one which uses dma_map_sg. In the case of the drivers/ieee1394/
> stack, the drivers for isochronous protocols, including userspace
> drivers via raw1394, also use dma_map_sg.
Thanks for the explanation.
> So if the SBP-2 driver manipulated the controller device's
> max_segment_size, it would influence the DMA mappings of the other
> protocols. It wouldn't be a big deal; the isochronous mappings could
IMHO, setting max_segment_size to the safest value is fine. After all,
the maximum is only 65536 bytes.
> only be collapsed to chunks of at most 15 pages instead of 16 pages.
> However, the mapping deconstructing code in the SBP-2 drivers is not a
> big deal either.
1. We don't need to round the SBP-2 segment size limit down to a
multiple of 4 kB (0xffff -> 0xf000). It is only necessary to
ensure quadlet alignment (0xffff -> 0xfffc).
2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
and the block IO layer about the restriction. This way we can
remove the size checks and segment splitting in the queuecommand
path.
This assumes that no other code in the ieee1394 stack uses
dma_map_sg() with conflicting requirements. It furthermore assumes
that the controller device's platform actually allows us to set the
segment size to our liking. Assert the latter with a BUG_ON().
3. Also use blk_queue_max_segment_size() to tell the block IO layer
about it. It cannot know it because our scsi_add_host() does not
point to the FireWire controller's device.
We can also uniformly use dma_map_sg() for the single segment case just
like for the multi segment case, to further simplify the code.
Also clean up how the page table is converted to big endian.
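Point 1 amounts to masking off the two low bits rather than rounding down to a page boundary; as a quick sanity check:

```c
#include <assert.h>

/* Quadlet-aligned SBP-2 segment size limit, per point 1 above. */
#define SBP2_MAX_SEG_SIZE	(0xffff & ~3)
```

0xffff & ~3 yields 0xfffc, which keeps 4092 more bytes per segment than the old 0xf000 limit.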
Signed-off-by: Stefan Richter <[email protected]>
---
Applicable after
[PATCH update] ieee1394: sbp2: stricter dma_sync
[PATCH] ieee1394: sbp2: check for DMA mapping failures
from a few minutes ago.
drivers/ieee1394/sbp2.c | 106 +++++++++++++---------------------------
drivers/ieee1394/sbp2.h | 37 +++++--------
2 files changed, 52 insertions(+), 91 deletions(-)
Index: linux-2.6.27-rc2/drivers/ieee1394/sbp2.h
===================================================================
--- linux-2.6.27-rc2.orig/drivers/ieee1394/sbp2.h
+++ linux-2.6.27-rc2/drivers/ieee1394/sbp2.h
@@ -139,13 +139,10 @@ struct sbp2_logout_orb {
u32 status_fifo_lo;
} __attribute__((packed));
-#define PAGE_TABLE_SET_SEGMENT_BASE_HI(v) ((v) & 0xffff)
-#define PAGE_TABLE_SET_SEGMENT_LENGTH(v) (((v) & 0xffff) << 16)
-
struct sbp2_unrestricted_page_table {
- u32 length_segment_base_hi;
- u32 segment_base_lo;
-} __attribute__((packed));
+ __be32 high;
+ __be32 low;
+};
#define RESP_STATUS_REQUEST_COMPLETE 0x0
#define RESP_STATUS_TRANSPORT_FAILURE 0x1
@@ -216,15 +213,18 @@ struct sbp2_status_block {
#define SBP2_UNIT_SPEC_ID_ENTRY 0x0000609e
#define SBP2_SW_VERSION_ENTRY 0x00010483
-
/*
- * SCSI specific definitions
+ * The default maximum s/g segment size of a FireWire controller is
+ * usually 0x10000, but SBP-2 only allows 0xffff. Since buffers have to
+ * be quadlet-aligned, we set the length limit to 0xffff & ~3.
*/
+#define SBP2_MAX_SEG_SIZE 0xfffc
-#define SBP2_MAX_SG_ELEMENT_LENGTH 0xf000
-/* There is no real limitation of the queue depth (i.e. length of the linked
+/*
+ * There is no real limitation of the queue depth (i.e. length of the linked
* list of command ORBs) at the target. The chosen depth is merely an
- * implementation detail of the sbp2 driver. */
+ * implementation detail of the sbp2 driver.
+ */
#define SBP2_MAX_CMDS 8
#define SBP2_SCSI_STATUS_GOOD 0x0
@@ -240,12 +240,6 @@ struct sbp2_status_block {
* Representations of commands and devices
*/
-enum sbp2_dma_types {
- CMD_DMA_NONE,
- CMD_DMA_PAGE,
- CMD_DMA_SINGLE
-};
-
/* Per SCSI command */
struct sbp2_command_info {
struct list_head list;
@@ -258,11 +252,10 @@ struct sbp2_command_info {
struct sbp2_unrestricted_page_table
scatter_gather_element[SG_ALL] __attribute__((aligned(8)));
dma_addr_t sge_dma;
- void *sge_buffer;
- dma_addr_t cmd_dma;
- enum sbp2_dma_types dma_type;
- unsigned long dma_size;
- enum dma_data_direction dma_dir;
+
+ struct scatterlist *sg_buffer;
+ int sg_count;
+ enum dma_data_direction sg_dir;
};
/* Per FireWire host */
Index: linux-2.6.27-rc2/drivers/ieee1394/sbp2.c
===================================================================
--- linux-2.6.27-rc2.orig/drivers/ieee1394/sbp2.c
+++ linux-2.6.27-rc2/drivers/ieee1394/sbp2.c
@@ -656,23 +656,10 @@ static struct sbp2_command_info *sbp2uti
static void sbp2util_mark_command_completed(struct sbp2_lu *lu,
struct sbp2_command_info *cmd)
{
- struct hpsb_host *host = lu->ud->ne->host;
-
- if (cmd->cmd_dma) {
- if (cmd->dma_type == CMD_DMA_SINGLE)
- dma_unmap_single(host->device.parent, cmd->cmd_dma,
- cmd->dma_size, cmd->dma_dir);
- else if (cmd->dma_type == CMD_DMA_PAGE)
- dma_unmap_page(host->device.parent, cmd->cmd_dma,
- cmd->dma_size, cmd->dma_dir);
- /* XXX: Check for CMD_DMA_NONE bug */
- cmd->dma_type = CMD_DMA_NONE;
- cmd->cmd_dma = 0;
- }
- if (cmd->sge_buffer) {
- dma_unmap_sg(host->device.parent, cmd->sge_buffer,
- cmd->dma_size, cmd->dma_dir);
- cmd->sge_buffer = NULL;
+ if (cmd->sg_buffer) {
+ dma_unmap_sg(lu->ud->ne->host->device.parent, cmd->sg_buffer,
+ cmd->sg_count, cmd->sg_dir);
+ cmd->sg_buffer = NULL;
}
list_move_tail(&cmd->list, &lu->cmd_orb_completed);
}
@@ -827,6 +814,10 @@ static struct sbp2_lu *sbp2_alloc_device
#endif
}
+ if (dma_get_max_seg_size(hi->host->device.parent) > SBP2_MAX_SEG_SIZE)
+ BUG_ON(dma_set_max_seg_size(hi->host->device.parent,
+ SBP2_MAX_SEG_SIZE));
+
/* Prevent unloading of the 1394 host */
if (!try_module_get(hi->host->driver->owner)) {
SBP2_ERR("failed to get a reference on 1394 host driver");
@@ -1501,76 +1492,51 @@ static int sbp2_agent_reset(struct sbp2_
static int sbp2_prep_command_orb_sg(struct sbp2_command_orb *orb,
struct sbp2_fwhost_info *hi,
struct sbp2_command_info *cmd,
- unsigned int scsi_use_sg,
+ unsigned int sg_count,
struct scatterlist *sg,
u32 orb_direction,
enum dma_data_direction dma_dir)
{
struct device *dmadev = hi->host->device.parent;
+ struct sbp2_unrestricted_page_table *pt;
+ int i, j, len;
+
+ i = dma_map_sg(dmadev, sg, sg_count, dma_dir);
+ if (i == 0)
+ return -ENOMEM;
+
+ cmd->sg_buffer = sg;
+ cmd->sg_count = sg_count;
+ cmd->sg_dir = dma_dir;
- cmd->dma_dir = dma_dir;
orb->data_descriptor_hi = ORB_SET_NODE_ID(hi->host->node_id);
orb->misc |= ORB_SET_DIRECTION(orb_direction);
/* special case if only one element (and less than 64KB in size) */
- if (scsi_use_sg == 1 && sg->length <= SBP2_MAX_SG_ELEMENT_LENGTH) {
-
- cmd->dma_size = sg->length;
- cmd->dma_type = CMD_DMA_PAGE;
- cmd->cmd_dma = dma_map_page(dmadev, sg_page(sg), sg->offset,
- cmd->dma_size, cmd->dma_dir);
- if (dma_mapping_error(dmadev, cmd->cmd_dma)) {
- cmd->cmd_dma = 0;
- return -ENOMEM;
- }
-
- orb->data_descriptor_lo = cmd->cmd_dma;
- orb->misc |= ORB_SET_DATA_SIZE(cmd->dma_size);
-
+ if (i == 1) {
+ orb->data_descriptor_lo = sg_dma_address(sg);
+ orb->misc |= ORB_SET_DATA_SIZE(sg_dma_len(sg));
} else {
- struct sbp2_unrestricted_page_table *sg_element =
- &cmd->scatter_gather_element[0];
- u32 sg_count, sg_len;
- dma_addr_t sg_addr;
- int i, count = dma_map_sg(dmadev, sg, scsi_use_sg, dma_dir);
-
- cmd->dma_size = scsi_use_sg;
- cmd->sge_buffer = sg;
-
- /* use page tables (s/g) */
- orb->misc |= ORB_SET_PAGE_TABLE_PRESENT(0x1);
- orb->data_descriptor_lo = cmd->sge_dma;
+ pt = &cmd->scatter_gather_element[0];
+ j = 0;
dma_sync_single_for_cpu(dmadev, cmd->sge_dma,
sizeof(cmd->scatter_gather_element),
DMA_TO_DEVICE);
- /* loop through and fill out our SBP-2 page tables
- * (and split up anything too large) */
- for (i = 0, sg_count = 0; i < count; i++, sg = sg_next(sg)) {
- sg_len = sg_dma_len(sg);
- sg_addr = sg_dma_address(sg);
- while (sg_len) {
- sg_element[sg_count].segment_base_lo = sg_addr;
- if (sg_len > SBP2_MAX_SG_ELEMENT_LENGTH) {
- sg_element[sg_count].length_segment_base_hi =
- PAGE_TABLE_SET_SEGMENT_LENGTH(SBP2_MAX_SG_ELEMENT_LENGTH);
- sg_addr += SBP2_MAX_SG_ELEMENT_LENGTH;
- sg_len -= SBP2_MAX_SG_ELEMENT_LENGTH;
- } else {
- sg_element[sg_count].length_segment_base_hi =
- PAGE_TABLE_SET_SEGMENT_LENGTH(sg_len);
- sg_len = 0;
- }
- sg_count++;
- }
- }
+ for_each_sg(sg, sg, sg_count, i) {
+ len = sg_dma_len(sg);
+ if (len == 0)
+ continue;
- orb->misc |= ORB_SET_DATA_SIZE(sg_count);
+ pt[j].high = cpu_to_be32(len << 16);
+ pt[j].low = cpu_to_be32(sg_dma_address(sg));
+ j++;
+ }
- sbp2util_cpu_to_be32_buffer(sg_element,
- (sizeof(struct sbp2_unrestricted_page_table)) *
- sg_count);
+ orb->misc |= ORB_SET_PAGE_TABLE_PRESENT(0x1) |
+ ORB_SET_DATA_SIZE(j);
+ orb->data_descriptor_lo = cmd->sge_dma;
dma_sync_single_for_device(dmadev, cmd->sge_dma,
sizeof(cmd->scatter_gather_element),
@@ -2037,6 +2003,8 @@ static int sbp2scsi_slave_configure(stru
sdev->start_stop_pwr_cond = 1;
if (lu->workarounds & SBP2_WORKAROUND_128K_MAX_TRANS)
blk_queue_max_sectors(sdev->request_queue, 128 * 1024 / 512);
+
+ blk_queue_max_segment_size(sdev->request_queue, SBP2_MAX_SEG_SIZE);
return 0;
}
--
Stefan Richter
-=====-==--- =--- -=--=
http://arcgraph.de/sr/
1. We don't need to round the SBP-2 segment size limit down to a
multiple of 4 kB (0xffff -> 0xf000). It is only necessary to
ensure quadlet alignment (0xffff -> 0xfffc).
2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
and the block IO layer about the restriction. This way we can
remove the size checks and segment splitting in the queuecommand
path.
This assumes that no other code in the firewire stack uses
dma_map_sg() with conflicting requirements. It furthermore assumes
that the controller device's platform actually allows us to set the
segment size to our liking. Assert the latter with a BUG_ON().
3. Also use blk_queue_max_segment_size() to tell the block IO layer
about it. It cannot know it because our scsi_add_host() does not
point to the FireWire controller's device.
Signed-off-by: Stefan Richter <[email protected]>
---
drivers/firewire/fw-sbp2.c | 68 +++++++++++++++++--------------------
1 file changed, 33 insertions(+), 35 deletions(-)
Index: linux-2.6.27-rc2/drivers/firewire/fw-sbp2.c
===================================================================
--- linux-2.6.27-rc2.orig/drivers/firewire/fw-sbp2.c
+++ linux-2.6.27-rc2/drivers/firewire/fw-sbp2.c
@@ -29,6 +29,7 @@
*/
#include <linux/blkdev.h>
+#include <linux/bug.h>
#include <linux/delay.h>
#include <linux/device.h>
#include <linux/dma-mapping.h>
@@ -181,10 +182,16 @@ struct sbp2_target {
#define SBP2_MAX_LOGIN_ORB_TIMEOUT 40000U /* Timeout in ms */
#define SBP2_ORB_TIMEOUT 2000U /* Timeout in ms */
#define SBP2_ORB_NULL 0x80000000
-#define SBP2_MAX_SG_ELEMENT_LENGTH 0xf000
#define SBP2_RETRY_LIMIT 0xf /* 15 retries */
#define SBP2_CYCLE_LIMIT (0xc8 << 12) /* 200 125us cycles */
+/*
+ * The default maximum s/g segment size of a FireWire controller is
+ * usually 0x10000, but SBP-2 only allows 0xffff. Since buffers have to
+ * be quadlet-aligned, we set the length limit to 0xffff & ~3.
+ */
+#define SBP2_MAX_SEG_SIZE 0xfffc
+
/* Unit directory keys */
#define SBP2_CSR_UNIT_CHARACTERISTICS 0x3a
#define SBP2_CSR_FIRMWARE_REVISION 0x3c
@@ -1099,6 +1106,10 @@ static int sbp2_probe(struct device *dev
struct Scsi_Host *shost;
u32 model, firmware_revision;
+ if (dma_get_max_seg_size(device->card->device) > SBP2_MAX_SEG_SIZE)
+ BUG_ON(dma_set_max_seg_size(device->card->device,
+ SBP2_MAX_SEG_SIZE));
+
shost = scsi_host_alloc(&scsi_driver_template, sizeof(*tgt));
if (shost == NULL)
return -ENOMEM;
@@ -1347,14 +1358,12 @@ static int
sbp2_map_scatterlist(struct sbp2_command_orb *orb, struct fw_device *device,
struct sbp2_logical_unit *lu)
{
- struct scatterlist *sg;
- int sg_len, l, i, j, count;
- dma_addr_t sg_addr;
-
- sg = scsi_sglist(orb->cmd);
- count = dma_map_sg(device->card->device, sg, scsi_sg_count(orb->cmd),
- orb->cmd->sc_data_direction);
- if (count == 0)
+ struct scatterlist *sg = scsi_sglist(orb->cmd);
+ int i, j, len;
+
+ i = dma_map_sg(device->card->device, sg, scsi_sg_count(orb->cmd),
+ orb->cmd->sc_data_direction);
+ if (i == 0)
goto fail;
/*
@@ -1364,7 +1373,7 @@ sbp2_map_scatterlist(struct sbp2_command
* as the second generation iPod which doesn't support page
* tables.
*/
- if (count == 1 && sg_dma_len(sg) < SBP2_MAX_SG_ELEMENT_LENGTH) {
+ if (i == 1) {
orb->request.data_descriptor.high =
cpu_to_be32(lu->tgt->address_high);
orb->request.data_descriptor.low =
@@ -1374,29 +1383,15 @@ sbp2_map_scatterlist(struct sbp2_command
return 0;
}
- /*
- * Convert the scatterlist to an sbp2 page table. If any
- * scatterlist entries are too big for sbp2, we split them as we
- * go. Even if we ask the block I/O layer to not give us sg
- * elements larger than 65535 bytes, some IOMMUs may merge sg elements
- * during DMA mapping, and Linux currently doesn't prevent this.
- */
- for (i = 0, j = 0; i < count; i++, sg = sg_next(sg)) {
- sg_len = sg_dma_len(sg);
- sg_addr = sg_dma_address(sg);
- while (sg_len) {
- /* FIXME: This won't get us out of the pinch. */
- if (unlikely(j >= ARRAY_SIZE(orb->page_table))) {
- fw_error("page table overflow\n");
- goto fail_page_table;
- }
- l = min(sg_len, SBP2_MAX_SG_ELEMENT_LENGTH);
- orb->page_table[j].low = cpu_to_be32(sg_addr);
- orb->page_table[j].high = cpu_to_be32(l << 16);
- sg_addr += l;
- sg_len -= l;
- j++;
- }
+ j = 0;
+ for_each_sg(sg, sg, scsi_sg_count(orb->cmd), i) {
+ len = sg_dma_len(sg);
+ if (len == 0)
+ continue;
+
+ orb->page_table[j].high = cpu_to_be32(len << 16);
+ orb->page_table[j].low = cpu_to_be32(sg_dma_address(sg));
+ j++;
}
orb->page_table_bus =
@@ -1420,8 +1415,9 @@ sbp2_map_scatterlist(struct sbp2_command
return 0;
fail_page_table:
- dma_unmap_sg(device->card->device, sg, scsi_sg_count(orb->cmd),
- orb->cmd->sc_data_direction);
+ orb->page_table_bus = 0;
+ dma_unmap_sg(device->card->device, scsi_sglist(orb->cmd),
+ scsi_sg_count(orb->cmd), orb->cmd->sc_data_direction);
fail:
return -ENOMEM;
}
@@ -1542,6 +1538,8 @@ static int sbp2_scsi_slave_configure(str
if (lu->tgt->workarounds & SBP2_WORKAROUND_128K_MAX_TRANS)
blk_queue_max_sectors(sdev->request_queue, 128 * 1024 / 512);
+ blk_queue_max_segment_size(sdev->request_queue, SBP2_MAX_SEG_SIZE);
+
return 0;
}
--
Stefan Richter
-=====-==--- =--- -=--=
http://arcgraph.de/sr/
On Sat, Aug 9, 2008 at 11:21 AM, Stefan Richter
<[email protected]> wrote:
> 1. We don't need to round the SBP-2 segment size limit down to a
> multiple of 4 kB (0xffff -> 0xf000). It is only necessary to
> ensure quadlet alignment (0xffff -> 0xfffc).
>
> 2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
> and the block IO layer about the restriction. This way we can
> remove the size checks and segment splitting in the queuecommand
> path.
>
> This assumes that no other code in the firewire stack uses
> dma_map_sg() with conflicting requirements. It furthermore assumes
> that the controller device's platform actually allows us to set the
> segment size to our liking. Assert the latter with a BUG_ON().
>
> 3. Also use blk_queue_max_segment_size() to tell the block IO layer
> about it. It cannot know it because our scsi_add_host() does not
> point to the FireWire controller's device.
>
> Signed-off-by: Stefan Richter <[email protected]>
> ---
> drivers/firewire/fw-sbp2.c | 68 +++++++++++++++++--------------------
> 1 file changed, 33 insertions(+), 35 deletions(-)
>
> Index: linux-2.6.27-rc2/drivers/firewire/fw-sbp2.c
> ===================================================================
> --- linux-2.6.27-rc2.orig/drivers/firewire/fw-sbp2.c
> +++ linux-2.6.27-rc2/drivers/firewire/fw-sbp2.c
> @@ -29,6 +29,7 @@
> */
>
> #include <linux/blkdev.h>
> +#include <linux/bug.h>
> #include <linux/delay.h>
> #include <linux/device.h>
> #include <linux/dma-mapping.h>
> @@ -181,10 +182,16 @@ struct sbp2_target {
> #define SBP2_MAX_LOGIN_ORB_TIMEOUT 40000U /* Timeout in ms */
> #define SBP2_ORB_TIMEOUT 2000U /* Timeout in ms */
> #define SBP2_ORB_NULL 0x80000000
> -#define SBP2_MAX_SG_ELEMENT_LENGTH 0xf000
> #define SBP2_RETRY_LIMIT 0xf /* 15 retries */
> #define SBP2_CYCLE_LIMIT (0xc8 << 12) /* 200 125us cycles */
>
> +/*
> + * The default maximum s/g segment size of a FireWire controller is
> + * usually 0x10000, but SBP-2 only allows 0xffff. Since buffers have to
> + * be quadlet-aligned, we set the length limit to 0xffff & ~3.
> + */
> +#define SBP2_MAX_SEG_SIZE 0xfffc
> +
> /* Unit directory keys */
> #define SBP2_CSR_UNIT_CHARACTERISTICS 0x3a
> #define SBP2_CSR_FIRMWARE_REVISION 0x3c
> @@ -1099,6 +1106,10 @@ static int sbp2_probe(struct device *dev
> struct Scsi_Host *shost;
> u32 model, firmware_revision;
>
> + if (dma_get_max_seg_size(device->card->device) > SBP2_MAX_SEG_SIZE)
> + BUG_ON(dma_set_max_seg_size(device->card->device,
> + SBP2_MAX_SEG_SIZE));
> +
> shost = scsi_host_alloc(&scsi_driver_template, sizeof(*tgt));
> if (shost == NULL)
> return -ENOMEM;
> @@ -1347,14 +1358,12 @@ static int
> sbp2_map_scatterlist(struct sbp2_command_orb *orb, struct fw_device *device,
> struct sbp2_logical_unit *lu)
> {
> - struct scatterlist *sg;
> - int sg_len, l, i, j, count;
> - dma_addr_t sg_addr;
> -
> - sg = scsi_sglist(orb->cmd);
> - count = dma_map_sg(device->card->device, sg, scsi_sg_count(orb->cmd),
> - orb->cmd->sc_data_direction);
> - if (count == 0)
> + struct scatterlist *sg = scsi_sglist(orb->cmd);
> + int i, j, len;
> +
> + i = dma_map_sg(device->card->device, sg, scsi_sg_count(orb->cmd),
> + orb->cmd->sc_data_direction);
> + if (i == 0)
> goto fail;
>
> /*
> @@ -1364,7 +1373,7 @@ sbp2_map_scatterlist(struct sbp2_command
> * as the second generation iPod which doesn't support page
> * tables.
> */
> - if (count == 1 && sg_dma_len(sg) < SBP2_MAX_SG_ELEMENT_LENGTH) {
> + if (i == 1) {
> orb->request.data_descriptor.high =
> cpu_to_be32(lu->tgt->address_high);
> orb->request.data_descriptor.low =
> @@ -1374,29 +1383,15 @@ sbp2_map_scatterlist(struct sbp2_command
> return 0;
> }
>
> - /*
> - * Convert the scatterlist to an sbp2 page table. If any
> - * scatterlist entries are too big for sbp2, we split them as we
> - * go. Even if we ask the block I/O layer to not give us sg
> - * elements larger than 65535 bytes, some IOMMUs may merge sg elements
> - * during DMA mapping, and Linux currently doesn't prevent this.
> - */
> - for (i = 0, j = 0; i < count; i++, sg = sg_next(sg)) {
> - sg_len = sg_dma_len(sg);
> - sg_addr = sg_dma_address(sg);
> - while (sg_len) {
> - /* FIXME: This won't get us out of the pinch. */
> - if (unlikely(j >= ARRAY_SIZE(orb->page_table))) {
> - fw_error("page table overflow\n");
> - goto fail_page_table;
> - }
> - l = min(sg_len, SBP2_MAX_SG_ELEMENT_LENGTH);
> - orb->page_table[j].low = cpu_to_be32(sg_addr);
> - orb->page_table[j].high = cpu_to_be32(l << 16);
I didn't check the rest of the driver - but it would be good if it
explicitly called dma_set_mask() or pci_set_dma_mask() with a 32-bit
mask value. Most drivers assume 32-bit, and that's why I point this
out.
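For reference, the 32-bit mask is just the low 32 bits set; the kernel's DMA_BIT_MASK() macro is roughly equivalent to:

```c
#include <assert.h>

/*
 * Roughly what include/linux/dma-mapping.h provides; a driver would
 * pass DMA_BIT_MASK(32) to dma_set_mask() / pci_set_dma_mask().
 * The n == 64 special case avoids the undefined 1ULL << 64 shift.
 */
#define DMA_BIT_MASK(n)	(((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))
```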
The rest looks ok at first glance.
thanks,
grant
> - sg_addr += l;
> - sg_len -= l;
> - j++;
> - }
> + j = 0;
> + for_each_sg(sg, sg, scsi_sg_count(orb->cmd), i) {
> + len = sg_dma_len(sg);
> + if (len == 0)
> + continue;
> +
> + orb->page_table[j].high = cpu_to_be32(len << 16);
> + orb->page_table[j].low = cpu_to_be32(sg_dma_address(sg));
> + j++;
> }
>
> orb->page_table_bus =
> @@ -1420,8 +1415,9 @@ sbp2_map_scatterlist(struct sbp2_command
> return 0;
>
> fail_page_table:
> - dma_unmap_sg(device->card->device, sg, scsi_sg_count(orb->cmd),
> - orb->cmd->sc_data_direction);
> + orb->page_table_bus = 0;
> + dma_unmap_sg(device->card->device, scsi_sglist(orb->cmd),
> + scsi_sg_count(orb->cmd), orb->cmd->sc_data_direction);
> fail:
> return -ENOMEM;
> }
> @@ -1542,6 +1538,8 @@ static int sbp2_scsi_slave_configure(str
> if (lu->tgt->workarounds & SBP2_WORKAROUND_128K_MAX_TRANS)
> blk_queue_max_sectors(sdev->request_queue, 128 * 1024 / 512);
>
> + blk_queue_max_segment_size(sdev->request_queue, SBP2_MAX_SEG_SIZE);
> +
> return 0;
> }
>
>
> --
> Stefan Richter
> -=====-==--- =--- -=--=
> http://arcgraph.de/sr/
>
>
On Sat, Aug 9, 2008 at 11:20 AM, Stefan Richter
<[email protected]> wrote:
> 1. We don't need to round the SBP-2 segment size limit down to a
> multiple of 4 kB (0xffff -> 0xf000). It is only necessary to
> ensure quadlet alignment (0xffff -> 0xfffc).
>
> 2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
> and the block IO layer about the restriction. This way we can
> remove the size checks and segment splitting in the queuecommand
> path.
>
> This assumes that no other code in the ieee1394 stack uses
> dma_map_sg() with conflicting requirements. It furthermore assumes
> that the controller device's platform actually allows us to set the
> segment size to our liking. Assert the latter with a BUG_ON().
>
> 3. Also use blk_queue_max_segment_size() to tell the block IO layer
> about it. It cannot know it because our scsi_add_host() does not
> point to the FireWire controller's device.
>
> We can also uniformly use dma_map_sg() for the single segment case just
> like for the multi segment case, to further simplify the code.
>
> Also clean up how the page table is converted to big endian.
>
> Signed-off-by: Stefan Richter <[email protected]>
> ---
>
> Applicable after
> [PATCH update] ieee1394: sbp2: stricter dma_sync
> [PATCH] ieee1394: sbp2: check for DMA mapping failures
> from a few minutes ago.
>
> drivers/ieee1394/sbp2.c | 106 +++++++++++++---------------------------
> drivers/ieee1394/sbp2.h | 37 +++++--------
> 2 files changed, 52 insertions(+), 91 deletions(-)
...
> + for_each_sg(sg, sg, sg_count, i) {
> + len = sg_dma_len(sg);
> + if (len == 0)
> + continue;
>
> - orb->misc |= ORB_SET_DATA_SIZE(sg_count);
> + pt[j].high = cpu_to_be32(len << 16);
> + pt[j].low = cpu_to_be32(sg_dma_address(sg));
> + j++;
> + }
While this code will probably work correctly, I believe that if
sg_dma_len() returns 0, one has walked off the end of the
"bus address" list.
pci_map_sg() returns how many DMA addr/len pairs are used
and that count could (should?) be used when pulling the
dma_len/addr pairs out of the sg list.
Effectively, the SG list is really two lists with separate counts:
the original list with virtual address/len pairs and its count, and the
"DMA mapped" list with its own count of address/len pairs. When no
IOMMU is used, those lists are 1:1.
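The distinction can be illustrated with a userspace mock; struct sg_mock and total_mapped_len are invented stand-ins, not kernel API:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the DMA-mapped side of a scatterlist entry. */
struct sg_mock {
	unsigned long dma_address;
	unsigned int dma_length;
};

/*
 * Sum the mapped lengths, walking exactly 'mapped' entries: the count
 * dma_map_sg() returned, not the original nents.  Entries past
 * 'mapped' may hold stale or zero dma lengths.
 */
static unsigned long total_mapped_len(const struct sg_mock *sg,
				      int mapped)
{
	unsigned long total = 0;
	int i;

	for (i = 0; i < mapped; i++)
		total += sg[i].dma_length;
	return total;
}
```

Here three original entries were collapsed to two mapped ones; only the first two dma_length values are valid, and anything beyond the returned count may be stale.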
hth,
grant
> - sbp2util_cpu_to_be32_buffer(sg_element,
> - (sizeof(struct sbp2_unrestricted_page_table)) *
> - sg_count);
> + orb->misc |= ORB_SET_PAGE_TABLE_PRESENT(0x1) |
> + ORB_SET_DATA_SIZE(j);
> + orb->data_descriptor_lo = cmd->sge_dma;
>
> dma_sync_single_for_device(dmadev, cmd->sge_dma,
> sizeof(cmd->scatter_gather_element),
> @@ -2037,6 +2003,8 @@ static int sbp2scsi_slave_configure(stru
> sdev->start_stop_pwr_cond = 1;
> if (lu->workarounds & SBP2_WORKAROUND_128K_MAX_TRANS)
> blk_queue_max_sectors(sdev->request_queue, 128 * 1024 / 512);
> +
> + blk_queue_max_segment_size(sdev->request_queue, SBP2_MAX_SEG_SIZE);
> return 0;
> }
>
>
> --
> Stefan Richter
> -=====-==--- =--- -=--=
> http://arcgraph.de/sr/
>
>
On Tue, 12 Aug 2008 10:04:14 -0700
"Grant Grundler" <[email protected]> wrote:
> On Sat, Aug 9, 2008 at 11:20 AM, Stefan Richter
> <[email protected]> wrote:
> > 1. We don't need to round the SBP-2 segment size limit down to a
> > multiple of 4 kB (0xffff -> 0xf000). It is only necessary to
> > ensure quadlet alignment (0xffff -> 0xfffc).
> >
> > 2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
> > and the block IO layer about the restriction. This way we can
> > remove the size checks and segment splitting in the queuecommand
> > path.
> >
> > This assumes that no other code in the ieee1394 stack uses
> > dma_map_sg() with conflicting requirements. It furthermore assumes
> > that the controller device's platform actually allows us to set the
> > segment size to our liking. Assert the latter with a BUG_ON().
> >
> > 3. Also use blk_queue_max_segment_size() to tell the block IO layer
> > about it. It cannot know it because our scsi_add_host() does not
> > point to the FireWire controller's device.
> >
> > We can also uniformly use dma_map_sg() for the single segment case just
> > like for the multi segment case, to further simplify the code.
> >
> > Also clean up how the page table is converted to big endian.
> >
> > Signed-off-by: Stefan Richter <[email protected]>
> > ---
> >
> > Applicable after
> > [PATCH update] ieee1394: sbp2: stricter dma_sync
> > [PATCH] ieee1394: sbp2: check for DMA mapping failures
> > from a few minutes ago.
> >
> > drivers/ieee1394/sbp2.c | 106 +++++++++++++---------------------------
> > drivers/ieee1394/sbp2.h | 37 +++++--------
> > 2 files changed, 52 insertions(+), 91 deletions(-)
> ...
> > + for_each_sg(sg, sg, sg_count, i) {
> > + len = sg_dma_len(sg);
> > + if (len == 0)
> > + continue;
> >
> > - orb->misc |= ORB_SET_DATA_SIZE(sg_count);
> > + pt[j].high = cpu_to_be32(len << 16);
> > + pt[j].low = cpu_to_be32(sg_dma_address(sg));
> > + j++;
> > + }
>
> While this code will probably work correctly, I believe if the
> sg_dma_len() returns 0, one has walked off the end of the
> "bus address" list.
>
> pci_map_sg() returns how many DMA addr/len pairs are used
> and that count could (should?) be used when pulling the
> dma_len/addr pairs out of the sg list.
Yeah, from a quick look, it seems that this patch handles sg_count
wrongly.
This patch sets scsi_sg_count(sc) to sg_count, right? for_each_sg is
expected to be used with the return value of pci_map_sg.
Then this patch can simply do something like:
for_each_sg(sg, sg, sg_count, i) {
pt[i].high = cpu_to_be32(sg_dma_len(sg) << 16);
pt[i].low = cpu_to_be32(sg_dma_address(sg));
}
Grant Grundler wrote:
> On Sat, Aug 9, 2008 at 11:21 AM, Stefan Richter
>> - orb->page_table[j].low = cpu_to_be32(sg_addr);
>> - orb->page_table[j].high = cpu_to_be32(l << 16);
>
> I didn't check the rest of the driver - but it would be good if it
> explicitly called dma_set_mask() or pci_dma_set_mask() with a 32-bit
> mask value. Most drivers assume 32-bit and that's why I point this
> out.
I thought about this and currently think that I should not do this.
There is an API to set the mask, but there is no API to only _decrease_
the mask. The SBP-2 protocol driver should not set DMA_32BIT_MASK if
another driver has already set, for example, DMA_31BIT_MASK for whatever
reason.
For now, this is not an issue. FireWire low-level drivers exist only for
controllers with 32-bit local bus addressing capability. I am sure that
somebody will remember to modify the SBP-2 driver(s) if there is ever a
controller type, and a driver for it, which can address more than 4 GB.
On the other hand, we also currently have no reason to set a smaller
mask. We recently discovered a chip bug which requires DMA_31BIT_MASK
for a special mode of data reception (only for cache-coherent DMA
though), but we now work around this chip bug by simply falling back to
an alternative mode.
--
Stefan Richter
-=====-==--- =--- -==-=
http://arcgraph.de/sr/
On 13 Aug, FUJITA Tomonori wrote:
> On Tue, 12 Aug 2008 10:04:14 -0700
> "Grant Grundler" <[email protected]> wrote:
>
>> On Sat, Aug 9, 2008 at 11:20 AM, Stefan Richter
>> <[email protected]> wrote:
>> > drivers/ieee1394/sbp2.c | 106 +++++++++++++---------------------------
>> > drivers/ieee1394/sbp2.h | 37 +++++--------
>> > 2 files changed, 52 insertions(+), 91 deletions(-)
>> ...
>> > + for_each_sg(sg, sg, sg_count, i) {
>> > + len = sg_dma_len(sg);
>> > + if (len == 0)
>> > + continue;
>> >
>> > - orb->misc |= ORB_SET_DATA_SIZE(sg_count);
>> > + pt[j].high = cpu_to_be32(len << 16);
>> > + pt[j].low = cpu_to_be32(sg_dma_address(sg));
>> > + j++;
>> > + }
>>
>> While this code will probably work correctly, I believe if the
>> sg_dma_len() returns 0, one has walked off the end of the
>> "bus address" list.
>>
>> pci_map_sg() returns how many DMA addr/len pairs are used
>> and that count could (should?) be used when pulling the
>> dma_len/addr pairs out of the sg list.
>
> Yeah, from a quick look, it seems that this patch handles sg_count
> wrongly.
>
> This patch sets scsi_sg_count(sc) to sg_count, right?
Well, I keep the original scsi_sg_count to call dma_unmap_sg with it
later. According to Corbet/Rubini/Kroah-Hartman, LDD3, dma_unmap_sg
is to be called with the number of s/g elements before DMA mapping.
From a quick look at some dma_unmap_sg implementations, this still seems
to be the case; or it doesn't actually matter.
> for_each_sg is expected to be used with the return value of pci_map_sg.
>
> Then this patch can simply do something like:
>
> for_each_sg(sg, sg, sg_count, i) {
> pt[i].high = cpu_to_be32(sg_dma_len(sg) << 16);
> pt[i].low = cpu_to_be32(sg_dma_address(sg));
> }
Thanks a lot for looking at this; here is my patch fixed.
From: Stefan Richter <[email protected]>
Subject: ieee1394: sbp2: enforce s/g segment size limit
1. We don't need to round the SBP-2 segment size limit down to a
multiple of 4 kB (0xffff -> 0xf000). It is only necessary to
ensure quadlet alignment (0xffff -> 0xfffc).
2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
and the block IO layer about the restriction. This way we can
remove the size checks and segment splitting in the queuecommand
path.
This assumes that no other code in the ieee1394 stack uses
dma_map_sg() with conflicting requirements. It furthermore assumes
that the controller device's platform actually allows us to set the
segment size to our liking. Assert the latter with a BUG_ON().
3. Also use blk_queue_max_segment_size() to tell the block IO layer
about it. It cannot know it because our scsi_add_host() does not
point to the FireWire controller's device.
We can also uniformly use dma_map_sg() for the single segment case just
like for the multi segment case, to further simplify the code.
Also clean up how the page table is converted to big endian.
Thanks to Grant Grundler and FUJITA Tomonori for advice.
Signed-off-by: Stefan Richter <[email protected]>
---
drivers/ieee1394/sbp2.c | 100 ++++++++++++----------------------------
drivers/ieee1394/sbp2.h | 37 ++++++--------
2 files changed, 46 insertions(+), 91 deletions(-)
Index: linux/drivers/ieee1394/sbp2.h
===================================================================
--- linux.orig/drivers/ieee1394/sbp2.h
+++ linux/drivers/ieee1394/sbp2.h
@@ -139,13 +139,10 @@ struct sbp2_logout_orb {
u32 status_fifo_lo;
} __attribute__((packed));
-#define PAGE_TABLE_SET_SEGMENT_BASE_HI(v) ((v) & 0xffff)
-#define PAGE_TABLE_SET_SEGMENT_LENGTH(v) (((v) & 0xffff) << 16)
-
struct sbp2_unrestricted_page_table {
- u32 length_segment_base_hi;
- u32 segment_base_lo;
-} __attribute__((packed));
+ __be32 high;
+ __be32 low;
+};
#define RESP_STATUS_REQUEST_COMPLETE 0x0
#define RESP_STATUS_TRANSPORT_FAILURE 0x1
@@ -216,15 +213,18 @@ struct sbp2_status_block {
#define SBP2_UNIT_SPEC_ID_ENTRY 0x0000609e
#define SBP2_SW_VERSION_ENTRY 0x00010483
-
/*
- * SCSI specific definitions
+ * The default maximum s/g segment size of a FireWire controller is
+ * usually 0x10000, but SBP-2 only allows 0xffff. Since buffers have to
+ * be quadlet-aligned, we set the length limit to 0xffff & ~3.
*/
+#define SBP2_MAX_SEG_SIZE 0xfffc
-#define SBP2_MAX_SG_ELEMENT_LENGTH 0xf000
-/* There is no real limitation of the queue depth (i.e. length of the linked
+/*
+ * There is no real limitation of the queue depth (i.e. length of the linked
* list of command ORBs) at the target. The chosen depth is merely an
- * implementation detail of the sbp2 driver. */
+ * implementation detail of the sbp2 driver.
+ */
#define SBP2_MAX_CMDS 8
#define SBP2_SCSI_STATUS_GOOD 0x0
@@ -240,12 +240,6 @@ struct sbp2_status_block {
* Representations of commands and devices
*/
-enum sbp2_dma_types {
- CMD_DMA_NONE,
- CMD_DMA_PAGE,
- CMD_DMA_SINGLE
-};
-
/* Per SCSI command */
struct sbp2_command_info {
struct list_head list;
@@ -258,11 +252,10 @@ struct sbp2_command_info {
struct sbp2_unrestricted_page_table
scatter_gather_element[SG_ALL] __attribute__((aligned(8)));
dma_addr_t sge_dma;
- void *sge_buffer;
- dma_addr_t cmd_dma;
- enum sbp2_dma_types dma_type;
- unsigned long dma_size;
- enum dma_data_direction dma_dir;
+
+ struct scatterlist *sg_buffer;
+ int sg_count;
+ enum dma_data_direction sg_dir;
};
/* Per FireWire host */
Index: linux/drivers/ieee1394/sbp2.c
===================================================================
--- linux.orig/drivers/ieee1394/sbp2.c
+++ linux/drivers/ieee1394/sbp2.c
@@ -656,23 +656,10 @@ static struct sbp2_command_info *sbp2uti
static void sbp2util_mark_command_completed(struct sbp2_lu *lu,
struct sbp2_command_info *cmd)
{
- struct hpsb_host *host = lu->ud->ne->host;
-
- if (cmd->cmd_dma) {
- if (cmd->dma_type == CMD_DMA_SINGLE)
- dma_unmap_single(host->device.parent, cmd->cmd_dma,
- cmd->dma_size, cmd->dma_dir);
- else if (cmd->dma_type == CMD_DMA_PAGE)
- dma_unmap_page(host->device.parent, cmd->cmd_dma,
- cmd->dma_size, cmd->dma_dir);
- /* XXX: Check for CMD_DMA_NONE bug */
- cmd->dma_type = CMD_DMA_NONE;
- cmd->cmd_dma = 0;
- }
- if (cmd->sge_buffer) {
- dma_unmap_sg(host->device.parent, cmd->sge_buffer,
- cmd->dma_size, cmd->dma_dir);
- cmd->sge_buffer = NULL;
+ if (cmd->sg_buffer) {
+ dma_unmap_sg(lu->ud->ne->host->device.parent, cmd->sg_buffer,
+ cmd->sg_count, cmd->sg_dir);
+ cmd->sg_buffer = NULL;
}
list_move_tail(&cmd->list, &lu->cmd_orb_completed);
}
@@ -827,6 +814,10 @@ static struct sbp2_lu *sbp2_alloc_device
#endif
}
+ if (dma_get_max_seg_size(hi->host->device.parent) > SBP2_MAX_SEG_SIZE)
+ BUG_ON(dma_set_max_seg_size(hi->host->device.parent,
+ SBP2_MAX_SEG_SIZE));
+
/* Prevent unloading of the 1394 host */
if (!try_module_get(hi->host->driver->owner)) {
SBP2_ERR("failed to get a reference on 1394 host driver");
@@ -1501,76 +1492,45 @@ static int sbp2_agent_reset(struct sbp2_
static int sbp2_prep_command_orb_sg(struct sbp2_command_orb *orb,
struct sbp2_fwhost_info *hi,
struct sbp2_command_info *cmd,
- unsigned int scsi_use_sg,
+ unsigned int sg_count,
struct scatterlist *sg,
u32 orb_direction,
enum dma_data_direction dma_dir)
{
struct device *dmadev = hi->host->device.parent;
+ struct sbp2_unrestricted_page_table *pt;
+ int i, n;
+
+ n = dma_map_sg(dmadev, sg, sg_count, dma_dir);
+ if (n == 0)
+ return -ENOMEM;
+
+ cmd->sg_buffer = sg;
+ cmd->sg_count = sg_count;
+ cmd->sg_dir = dma_dir;
- cmd->dma_dir = dma_dir;
orb->data_descriptor_hi = ORB_SET_NODE_ID(hi->host->node_id);
orb->misc |= ORB_SET_DIRECTION(orb_direction);
/* special case if only one element (and less than 64KB in size) */
- if (scsi_use_sg == 1 && sg->length <= SBP2_MAX_SG_ELEMENT_LENGTH) {
-
- cmd->dma_size = sg->length;
- cmd->dma_type = CMD_DMA_PAGE;
- cmd->cmd_dma = dma_map_page(dmadev, sg_page(sg), sg->offset,
- cmd->dma_size, cmd->dma_dir);
- if (dma_mapping_error(cmd->cmd_dma)) {
- cmd->cmd_dma = 0;
- return -ENOMEM;
- }
-
- orb->data_descriptor_lo = cmd->cmd_dma;
- orb->misc |= ORB_SET_DATA_SIZE(cmd->dma_size);
-
+ if (n == 1) {
+ orb->misc |= ORB_SET_DATA_SIZE(sg_dma_len(sg));
+ orb->data_descriptor_lo = sg_dma_address(sg);
} else {
- struct sbp2_unrestricted_page_table *sg_element =
- &cmd->scatter_gather_element[0];
- u32 sg_count, sg_len;
- dma_addr_t sg_addr;
- int i, count = dma_map_sg(dmadev, sg, scsi_use_sg, dma_dir);
-
- cmd->dma_size = scsi_use_sg;
- cmd->sge_buffer = sg;
-
- /* use page tables (s/g) */
- orb->misc |= ORB_SET_PAGE_TABLE_PRESENT(0x1);
- orb->data_descriptor_lo = cmd->sge_dma;
+ pt = &cmd->scatter_gather_element[0];
dma_sync_single_for_cpu(dmadev, cmd->sge_dma,
sizeof(cmd->scatter_gather_element),
DMA_TO_DEVICE);
- /* loop through and fill out our SBP-2 page tables
- * (and split up anything too large) */
- for (i = 0, sg_count = 0; i < count; i++, sg = sg_next(sg)) {
- sg_len = sg_dma_len(sg);
- sg_addr = sg_dma_address(sg);
- while (sg_len) {
- sg_element[sg_count].segment_base_lo = sg_addr;
- if (sg_len > SBP2_MAX_SG_ELEMENT_LENGTH) {
- sg_element[sg_count].length_segment_base_hi =
- PAGE_TABLE_SET_SEGMENT_LENGTH(SBP2_MAX_SG_ELEMENT_LENGTH);
- sg_addr += SBP2_MAX_SG_ELEMENT_LENGTH;
- sg_len -= SBP2_MAX_SG_ELEMENT_LENGTH;
- } else {
- sg_element[sg_count].length_segment_base_hi =
- PAGE_TABLE_SET_SEGMENT_LENGTH(sg_len);
- sg_len = 0;
- }
- sg_count++;
- }
+ for_each_sg(sg, sg, n, i) {
+ pt[i].high = cpu_to_be32(sg_dma_len(sg) << 16);
+ pt[i].low = cpu_to_be32(sg_dma_address(sg));
}
- orb->misc |= ORB_SET_DATA_SIZE(sg_count);
-
- sbp2util_cpu_to_be32_buffer(sg_element,
- (sizeof(struct sbp2_unrestricted_page_table)) *
- sg_count);
+ orb->misc |= ORB_SET_PAGE_TABLE_PRESENT(0x1) |
+ ORB_SET_DATA_SIZE(n);
+ orb->data_descriptor_lo = cmd->sge_dma;
dma_sync_single_for_device(dmadev, cmd->sge_dma,
sizeof(cmd->scatter_gather_element),
@@ -2037,6 +1997,8 @@ static int sbp2scsi_slave_configure(stru
sdev->start_stop_pwr_cond = 1;
if (lu->workarounds & SBP2_WORKAROUND_128K_MAX_TRANS)
blk_queue_max_sectors(sdev->request_queue, 128 * 1024 / 512);
+
+ blk_queue_max_segment_size(sdev->request_queue, SBP2_MAX_SEG_SIZE);
return 0;
}
--
Stefan Richter
-=====-==--- =--- -==-=
http://arcgraph.de/sr/
1. We don't need to round the SBP-2 segment size limit down to a
multiple of 4 kB (0xffff -> 0xf000). It is only necessary to
ensure quadlet alignment (0xffff -> 0xfffc).
2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
and the block IO layer about the restriction. This way we can
remove the size checks and segment splitting in the queuecommand
path.
This assumes that no other code in the firewire stack uses
dma_map_sg() with conflicting requirements. It furthermore assumes
that the controller device's platform actually allows us to set the
segment size to our liking. Assert the latter with a BUG_ON().
3. Also use blk_queue_max_segment_size() to tell the block IO layer
about it. It cannot know it because our scsi_add_host() does not
point to the FireWire controller's device.
Thanks to Grant Grundler and FUJITA Tomonori for advice.
Signed-off-by: Stefan Richter <[email protected]>
---
drivers/firewire/fw-sbp2.c | 64 ++++++++++++++++---------------------
1 file changed, 28 insertions(+), 36 deletions(-)
Index: linux/drivers/firewire/fw-sbp2.c
===================================================================
--- linux.orig/drivers/firewire/fw-sbp2.c
+++ linux/drivers/firewire/fw-sbp2.c
@@ -29,6 +29,7 @@
*/
#include <linux/blkdev.h>
+#include <linux/bug.h>
#include <linux/delay.h>
#include <linux/device.h>
#include <linux/dma-mapping.h>
@@ -181,10 +182,16 @@ struct sbp2_target {
#define SBP2_MAX_LOGIN_ORB_TIMEOUT 40000U /* Timeout in ms */
#define SBP2_ORB_TIMEOUT 2000U /* Timeout in ms */
#define SBP2_ORB_NULL 0x80000000
-#define SBP2_MAX_SG_ELEMENT_LENGTH 0xf000
#define SBP2_RETRY_LIMIT 0xf /* 15 retries */
#define SBP2_CYCLE_LIMIT (0xc8 << 12) /* 200 125us cycles */
+/*
+ * The default maximum s/g segment size of a FireWire controller is
+ * usually 0x10000, but SBP-2 only allows 0xffff. Since buffers have to
+ * be quadlet-aligned, we set the length limit to 0xffff & ~3.
+ */
+#define SBP2_MAX_SEG_SIZE 0xfffc
+
/* Unit directory keys */
#define SBP2_CSR_UNIT_CHARACTERISTICS 0x3a
#define SBP2_CSR_FIRMWARE_REVISION 0x3c
@@ -1099,6 +1106,10 @@ static int sbp2_probe(struct device *dev
struct Scsi_Host *shost;
u32 model, firmware_revision;
+ if (dma_get_max_seg_size(device->card->device) > SBP2_MAX_SEG_SIZE)
+ BUG_ON(dma_set_max_seg_size(device->card->device,
+ SBP2_MAX_SEG_SIZE));
+
shost = scsi_host_alloc(&scsi_driver_template, sizeof(*tgt));
if (shost == NULL)
return -ENOMEM;
@@ -1347,14 +1358,12 @@ static int
sbp2_map_scatterlist(struct sbp2_command_orb *orb, struct fw_device *device,
struct sbp2_logical_unit *lu)
{
- struct scatterlist *sg;
- int sg_len, l, i, j, count;
- dma_addr_t sg_addr;
-
- sg = scsi_sglist(orb->cmd);
- count = dma_map_sg(device->card->device, sg, scsi_sg_count(orb->cmd),
- orb->cmd->sc_data_direction);
- if (count == 0)
+ struct scatterlist *sg = scsi_sglist(orb->cmd);
+ int i, n;
+
+ n = dma_map_sg(device->card->device, sg, scsi_sg_count(orb->cmd),
+ orb->cmd->sc_data_direction);
+ if (n == 0)
goto fail;
/*
@@ -1364,7 +1373,7 @@ sbp2_map_scatterlist(struct sbp2_command
* as the second generation iPod which doesn't support page
* tables.
*/
- if (count == 1 && sg_dma_len(sg) < SBP2_MAX_SG_ELEMENT_LENGTH) {
+ if (n == 1) {
orb->request.data_descriptor.high =
cpu_to_be32(lu->tgt->address_high);
orb->request.data_descriptor.low =
@@ -1374,29 +1383,9 @@ sbp2_map_scatterlist(struct sbp2_command
return 0;
}
- /*
- * Convert the scatterlist to an sbp2 page table. If any
- * scatterlist entries are too big for sbp2, we split them as we
- * go. Even if we ask the block I/O layer to not give us sg
- * elements larger than 65535 bytes, some IOMMUs may merge sg elements
- * during DMA mapping, and Linux currently doesn't prevent this.
- */
- for (i = 0, j = 0; i < count; i++, sg = sg_next(sg)) {
- sg_len = sg_dma_len(sg);
- sg_addr = sg_dma_address(sg);
- while (sg_len) {
- /* FIXME: This won't get us out of the pinch. */
- if (unlikely(j >= ARRAY_SIZE(orb->page_table))) {
- fw_error("page table overflow\n");
- goto fail_page_table;
- }
- l = min(sg_len, SBP2_MAX_SG_ELEMENT_LENGTH);
- orb->page_table[j].low = cpu_to_be32(sg_addr);
- orb->page_table[j].high = cpu_to_be32(l << 16);
- sg_addr += l;
- sg_len -= l;
- j++;
- }
+ for_each_sg(sg, sg, n, i) {
+ orb->page_table[i].high = cpu_to_be32(sg_dma_len(sg) << 16);
+ orb->page_table[i].low = cpu_to_be32(sg_dma_address(sg));
}
orb->page_table_bus =
@@ -1415,13 +1404,14 @@ sbp2_map_scatterlist(struct sbp2_command
orb->request.data_descriptor.high = cpu_to_be32(lu->tgt->address_high);
orb->request.data_descriptor.low = cpu_to_be32(orb->page_table_bus);
orb->request.misc |= cpu_to_be32(COMMAND_ORB_PAGE_TABLE_PRESENT |
- COMMAND_ORB_DATA_SIZE(j));
+ COMMAND_ORB_DATA_SIZE(n));
return 0;
fail_page_table:
- dma_unmap_sg(device->card->device, sg, scsi_sg_count(orb->cmd),
- orb->cmd->sc_data_direction);
+ orb->page_table_bus = 0;
+ dma_unmap_sg(device->card->device, scsi_sglist(orb->cmd),
+ scsi_sg_count(orb->cmd), orb->cmd->sc_data_direction);
fail:
return -ENOMEM;
}
@@ -1542,6 +1532,8 @@ static int sbp2_scsi_slave_configure(str
if (lu->tgt->workarounds & SBP2_WORKAROUND_128K_MAX_TRANS)
blk_queue_max_sectors(sdev->request_queue, 128 * 1024 / 512);
+ blk_queue_max_segment_size(sdev->request_queue, SBP2_MAX_SEG_SIZE);
+
return 0;
}
--
Stefan Richter
-=====-==--- =--- -==-=
http://arcgraph.de/sr/
I wrote:
> - if (dma_mapping_error(cmd->cmd_dma)) {
PS: That's the 2.6.26 version; the 2.6.27 patch has an additional
argument here, if anybody cares...
--
Stefan Richter
-=====-==--- =--- -==-=
http://arcgraph.de/sr/
On Wed, 13 Aug 2008 12:19:59 +0200 (CEST)
Stefan Richter <[email protected]> wrote:
> On 13 Aug, FUJITA Tomonori wrote:
> > On Tue, 12 Aug 2008 10:04:14 -0700
> > "Grant Grundler" <[email protected]> wrote:
> >
> >> On Sat, Aug 9, 2008 at 11:20 AM, Stefan Richter
> >> <[email protected]> wrote:
> >> > drivers/ieee1394/sbp2.c | 106 +++++++++++++---------------------------
> >> > drivers/ieee1394/sbp2.h | 37 +++++--------
> >> > 2 files changed, 52 insertions(+), 91 deletions(-)
> >> ...
> >> > + for_each_sg(sg, sg, sg_count, i) {
> >> > + len = sg_dma_len(sg);
> >> > + if (len == 0)
> >> > + continue;
> >> >
> >> > - orb->misc |= ORB_SET_DATA_SIZE(sg_count);
> >> > + pt[j].high = cpu_to_be32(len << 16);
> >> > + pt[j].low = cpu_to_be32(sg_dma_address(sg));
> >> > + j++;
> >> > + }
> >>
> >> While this code will probably work correctly, I believe if the
> >> sg_dma_len() returns 0, one has walked off the end of the
> >> "bus address" list.
> >>
> >> pci_map_sg() returns how many DMA addr/len pairs are used
> >> and that count could (should?) be used when pulling the
> >> dma_len/addr pairs out of the sg list.
> >
> > Yeah, from a quick look, seems that this patch wrongly handles
> > sg_count.
> >
> > This patch sets scsi_sg_count(sc) to sg_count, right?
>
> Well, I keep the original scsi_sg_count to call dma_unmap_sg with it
> later.
Do you need to keep it? You can access it via scsi_sg_count when
calling dma_unmap_sg; the firewire code seems to do that.
> According to Corbet/ Rubini/ Kroah-Hartman: LDD3, dma_unmap_sg
> is to be called with the number of s/g elements before DMA-mapping.
Are you talking about this?
=
To unmap a scatterlist, just call:
pci_unmap_sg(pdev, sglist, nents, direction);
Again, make sure DMA activity has already finished.
PLEASE NOTE: The 'nents' argument to the pci_unmap_sg call must be
the _same_ one you passed into the pci_map_sg call,
it should _NOT_ be the 'count' value _returned_ from the
pci_map_sg call.
This is from Documentation/DMA-mapping.txt
> From a quick look at some dma_unmap_sg implementations, this still seems
> to be the case; or it doesn't actually matter.
I think that all the IOMMU implementations are fine even if you call
dma_unmap_sg with the 'nents' value that dma_map_sg returned. But, well,
it's against the rule.
On 14 Aug, FUJITA Tomonori wrote:
> On Wed, 13 Aug 2008 12:19:59 +0200 (CEST)
> Stefan Richter <[email protected]> wrote:
>> I keep the original scsi_sg_count to call dma_unmap_sg with it
>> later.
>
> Do you need to keep it? You can access it via scsi_sg_count when
> calling dma_unmap_sg; the firewire code seems to do that.
Yes, I can easily do it this way; fixed below.
[...nents in dma_unmap_sg...]
> This is from Documentation/DMA-mapping.txt
Right. How did I forget about that?
From: Stefan Richter <[email protected]>
Subject: ieee1394: sbp2: remove redundant struct members
Signed-off-by: Stefan Richter <[email protected]>
---
drivers/ieee1394/sbp2.c | 14 +++++---------
drivers/ieee1394/sbp2.h | 4 ----
2 files changed, 5 insertions(+), 13 deletions(-)
Index: linux/drivers/ieee1394/sbp2.c
===================================================================
--- linux.orig/drivers/ieee1394/sbp2.c
+++ linux/drivers/ieee1394/sbp2.c
@@ -656,11 +656,11 @@ static struct sbp2_command_info *sbp2uti
static void sbp2util_mark_command_completed(struct sbp2_lu *lu,
struct sbp2_command_info *cmd)
{
- if (cmd->sg_buffer) {
- dma_unmap_sg(lu->ud->ne->host->device.parent, cmd->sg_buffer,
- cmd->sg_count, cmd->sg_dir);
- cmd->sg_buffer = NULL;
- }
+ if (scsi_sg_count(cmd->Current_SCpnt) > 0)
+ dma_unmap_sg(lu->ud->ne->host->device.parent,
+ scsi_sglist(cmd->Current_SCpnt),
+ scsi_sg_count(cmd->Current_SCpnt),
+ cmd->Current_SCpnt->sc_data_direction);
list_move_tail(&cmd->list, &lu->cmd_orb_completed);
}
@@ -1505,10 +1505,6 @@ static int sbp2_prep_command_orb_sg(stru
if (n == 0)
return -ENOMEM;
- cmd->sg_buffer = sg;
- cmd->sg_count = sg_count;
- cmd->sg_dir = dma_dir;
-
orb->data_descriptor_hi = ORB_SET_NODE_ID(hi->host->node_id);
orb->misc |= ORB_SET_DIRECTION(orb_direction);
Index: linux/drivers/ieee1394/sbp2.h
===================================================================
--- linux.orig/drivers/ieee1394/sbp2.h
+++ linux/drivers/ieee1394/sbp2.h
@@ -252,10 +252,6 @@ struct sbp2_command_info {
struct sbp2_unrestricted_page_table
scatter_gather_element[SG_ALL] __attribute__((aligned(8)));
dma_addr_t sge_dma;
-
- struct scatterlist *sg_buffer;
- int sg_count;
- enum dma_data_direction sg_dir;
};
/* Per FireWire host */
--
Stefan Richter
-=====-==--- =--- -===-
http://arcgraph.de/sr/
On Thu, 14 Aug 2008 09:12:01 +0200 (CEST)
Stefan Richter <[email protected]> wrote:
> On 14 Aug, FUJITA Tomonori wrote:
> > On Wed, 13 Aug 2008 12:19:59 +0200 (CEST)
> > Stefan Richter <[email protected]> wrote:
> >> I keep the original scsi_sg_count to call dma_unmap_sg with it
> >> later.
> >
> > Do you need to keep it? You can access it via scsi_sg_count when
> > calling dma_unmap_sg; the firewire code seems to do that.
>
> Yes, I can easily do it this way; fixed below.
>
> [...nents in dma_unmap_sg...]
> > This is from Documentation/DMA-mapping.txt
>
> Right. How did I forget about that?
>
>
>
> From: Stefan Richter <[email protected]>
> Subject: ieee1394: sbp2: remove redundant struct members
>
> Signed-off-by: Stefan Richter <[email protected]>
> ---
> drivers/ieee1394/sbp2.c | 14 +++++---------
> drivers/ieee1394/sbp2.h | 4 ----
> 2 files changed, 5 insertions(+), 13 deletions(-)
>
> Index: linux/drivers/ieee1394/sbp2.c
> ===================================================================
> --- linux.orig/drivers/ieee1394/sbp2.c
> +++ linux/drivers/ieee1394/sbp2.c
> @@ -656,11 +656,11 @@ static struct sbp2_command_info *sbp2uti
> static void sbp2util_mark_command_completed(struct sbp2_lu *lu,
> struct sbp2_command_info *cmd)
> {
> - if (cmd->sg_buffer) {
> - dma_unmap_sg(lu->ud->ne->host->device.parent, cmd->sg_buffer,
> - cmd->sg_count, cmd->sg_dir);
> - cmd->sg_buffer = NULL;
> - }
> + if (scsi_sg_count(cmd->Current_SCpnt) > 0)
scsi_sg_count() returns unsigned:

static inline unsigned scsi_sg_count(struct scsi_cmnd *cmd)

so

> + if (scsi_sg_count(cmd->Current_SCpnt))

makes more sense, I think. It looks like a nice cleanup apart from that.
> + dma_unmap_sg(lu->ud->ne->host->device.parent,
> + scsi_sglist(cmd->Current_SCpnt),
> + scsi_sg_count(cmd->Current_SCpnt),
> + cmd->Current_SCpnt->sc_data_direction);
> list_move_tail(&cmd->list, &lu->cmd_orb_completed);
> }
>
> @@ -1505,10 +1505,6 @@ static int sbp2_prep_command_orb_sg(stru
> if (n == 0)
> return -ENOMEM;
>
> - cmd->sg_buffer = sg;
> - cmd->sg_count = sg_count;
> - cmd->sg_dir = dma_dir;
> -
> orb->data_descriptor_hi = ORB_SET_NODE_ID(hi->host->node_id);
> orb->misc |= ORB_SET_DIRECTION(orb_direction);
>
> Index: linux/drivers/ieee1394/sbp2.h
> ===================================================================
> --- linux.orig/drivers/ieee1394/sbp2.h
> +++ linux/drivers/ieee1394/sbp2.h
> @@ -252,10 +252,6 @@ struct sbp2_command_info {
> struct sbp2_unrestricted_page_table
> scatter_gather_element[SG_ALL] __attribute__((aligned(8)));
> dma_addr_t sge_dma;
> -
> - struct scatterlist *sg_buffer;
> - int sg_count;
> - enum dma_data_direction sg_dir;
> };
>
> /* Per FireWire host */
>
>
> --
> Stefan Richter
> -=====-==--- =--- -===-
> http://arcgraph.de/sr/
>