2022-06-27 15:34:04

by John Garry

[permalink] [raw]
Subject: [PATCH v4 0/5] DMA mapping changes for SCSI core

As reported in [0], DMA mappings whose size exceeds the IOMMU IOVA caching
limit may see a big performance hit.

This series introduces a new DMA mapping API, dma_opt_mapping_size(), so
that drivers may know this limit when performance is a factor in the
mapping.

The SCSI SAS transport code is modified only to use this limit. For now I
did not want to touch other hosts as I have a concern that this change
could cause a performance regression.

I also added a patch for libata-scsi as it does not currently honour the
shost max_sectors limit.

[0] https://lore.kernel.org/linux-iommu/[email protected]/
[1] https://lore.kernel.org/linux-iommu/[email protected]/

Changes since v3:
- Apply max DMA optimial limit to SAS hosts only
Note: Even though "scsi: core: Cap shost max_sectors only once when
adding" is a subset of a previous patch I did not transfer the RB tags
- Rebase on v5.19-rc4

Changes since v2:
- Rebase on v5.19-rc1
- Add Damien's tag to 2/4 (thanks)

Changes since v1:
- Relocate scsi_add_host_with_dma() dma_dev check (Reported by Dan)
- Add tags from Damien and Martin (thanks)
- note: I only added Martin's tag to the SCSI patch

John Garry (5):
dma-mapping: Add dma_opt_mapping_size()
dma-iommu: Add iommu_dma_opt_mapping_size()
scsi: core: Cap shost max_sectors according to DMA mapping limits only
once
scsi: scsi_transport_sas: Cap shost max_sectors according to DMA
optimal mapping limit
libata-scsi: Cap ata_device->max_sectors according to
shost->max_sectors

Documentation/core-api/dma-api.rst | 9 +++++++++
drivers/ata/libata-scsi.c | 1 +
drivers/iommu/dma-iommu.c | 6 ++++++
drivers/iommu/iova.c | 5 +++++
drivers/scsi/hosts.c | 5 +++++
drivers/scsi/scsi_lib.c | 4 ----
drivers/scsi/scsi_transport_sas.c | 6 ++++++
include/linux/dma-map-ops.h | 1 +
include/linux/dma-mapping.h | 5 +++++
include/linux/iova.h | 2 ++
kernel/dma/mapping.c | 12 ++++++++++++
11 files changed, 52 insertions(+), 4 deletions(-)

--
2.35.3


2022-06-27 15:34:06

by John Garry

[permalink] [raw]
Subject: [PATCH v4 2/5] dma-iommu: Add iommu_dma_opt_mapping_size()

Add the IOMMU callback for DMA mapping API dma_opt_mapping_size(), which
allows the drivers to know the optimal mapping limit and thus limit the
requested IOVA lengths.

This value is based on the IOVA rcache range limit, as IOVAs allocated
above this limit must always be newly allocated, which may be quite slow.

Signed-off-by: John Garry <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
---
drivers/iommu/dma-iommu.c | 6 ++++++
drivers/iommu/iova.c | 5 +++++
include/linux/iova.h | 2 ++
3 files changed, 13 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f90251572a5d..9e1586447ee8 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1459,6 +1459,11 @@ static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
}

+static size_t iommu_dma_opt_mapping_size(void)
+{
+ return iova_rcache_range();
+}
+
static const struct dma_map_ops iommu_dma_ops = {
.alloc = iommu_dma_alloc,
.free = iommu_dma_free,
@@ -1479,6 +1484,7 @@ static const struct dma_map_ops iommu_dma_ops = {
.map_resource = iommu_dma_map_resource,
.unmap_resource = iommu_dma_unmap_resource,
.get_merge_boundary = iommu_dma_get_merge_boundary,
+ .opt_mapping_size = iommu_dma_opt_mapping_size,
};

/*
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index db77aa675145..9f00b58d546e 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -26,6 +26,11 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
static void free_iova_rcaches(struct iova_domain *iovad);

+unsigned long iova_rcache_range(void)
+{
+ return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
+}
+
static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
{
struct iova_domain *iovad;
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 320a70e40233..c6ba6d95d79c 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -79,6 +79,8 @@ static inline unsigned long iova_pfn(struct iova_domain *iovad, dma_addr_t iova)
int iova_cache_get(void);
void iova_cache_put(void);

+unsigned long iova_rcache_range(void);
+
void free_iova(struct iova_domain *iovad, unsigned long pfn);
void __free_iova(struct iova_domain *iovad, struct iova *iova);
struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,
--
2.35.3

2022-06-27 15:34:22

by John Garry

[permalink] [raw]
Subject: [PATCH v4 3/5] scsi: core: Cap shost max_sectors according to DMA mapping limits only once

The shost->max_sectors is repeatedly capped according to the host DMA
mapping limit for each sdev in __scsi_init_queue(). This is unnecessary, so
set only once when adding the host.

Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/hosts.c | 5 +++++
drivers/scsi/scsi_lib.c | 4 ----
2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 8352f90d997d..d04bd2c7c9f1 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -236,6 +236,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,

shost->dma_dev = dma_dev;

+ if (dma_dev->dma_mask) {
+ shost->max_sectors = min_t(unsigned int, shost->max_sectors,
+ dma_max_mapping_size(dma_dev) >> SECTOR_SHIFT);
+ }
+
error = scsi_mq_setup_tags(shost);
if (error)
goto fail;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 6ffc9e4258a8..6ce8acea322a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1884,10 +1884,6 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
blk_queue_max_integrity_segments(q, shost->sg_prot_tablesize);
}

- if (dev->dma_mask) {
- shost->max_sectors = min_t(unsigned int, shost->max_sectors,
- dma_max_mapping_size(dev) >> SECTOR_SHIFT);
- }
blk_queue_max_hw_sectors(q, shost->max_sectors);
blk_queue_segment_boundary(q, shost->dma_boundary);
dma_set_seg_boundary(dev, shost->dma_boundary);
--
2.35.3

2022-06-27 15:34:43

by John Garry

[permalink] [raw]
Subject: [PATCH v4 5/5] libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors

ATA devices (struct ata_device) have a max_sectors field which is
configured internally in libata. This is then used to (re)configure the
associated sdev request queue max_sectors value from how it is earlier set
in __scsi_init_queue(). In __scsi_init_queue() the max_sectors value is set
according to shost limits, which includes host DMA mapping limits.

Cap the ata_device max_sectors according to shost->max_sectors to respect
this shost limit.

Signed-off-by: John Garry <[email protected]>
Acked-by: Damien Le Moal <[email protected]>
---
drivers/ata/libata-scsi.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 86dbb1cdfabd..24a43d540d9f 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1060,6 +1060,7 @@ int ata_scsi_dev_config(struct scsi_device *sdev, struct ata_device *dev)
dev->flags |= ATA_DFLAG_NO_UNLOAD;

/* configure max sectors */
+ dev->max_sectors = min(dev->max_sectors, sdev->host->max_sectors);
blk_queue_max_hw_sectors(q, dev->max_sectors);

if (dev->class == ATA_DEV_ATAPI) {
--
2.35.3

2022-06-27 15:54:46

by John Garry

[permalink] [raw]
Subject: [PATCH v4 4/5] scsi: scsi_transport_sas: Cap shost max_sectors according to DMA optimal mapping limit

Streaming DMA mappings may be considerably slower when mappings go through
an IOMMU and the total mapping length is somewhat long. This is because the
IOMMU IOVA code allocates and free an IOVA for each mapping, which may
affect performance.

For performance reasons set the request queue max_sectors from
dma_opt_mapping_size(), which knows this mapping limit.

Signed-off-by: John Garry <[email protected]>
---
drivers/scsi/scsi_transport_sas.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
index 12bff64dade6..1b45248748e0 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -225,6 +225,7 @@ static int sas_host_setup(struct transport_container *tc, struct device *dev,
{
struct Scsi_Host *shost = dev_to_shost(dev);
struct sas_host_attrs *sas_host = to_sas_host_attrs(shost);
+ struct device *dma_dev = shost->dma_dev;

INIT_LIST_HEAD(&sas_host->rphy_list);
mutex_init(&sas_host->lock);
@@ -236,6 +237,11 @@ static int sas_host_setup(struct transport_container *tc, struct device *dev,
dev_printk(KERN_ERR, dev, "fail to a bsg device %d\n",
shost->host_no);

+ if (dma_dev) {
+ shost->max_sectors = min_t(unsigned int, shost->max_sectors,
+ dma_opt_mapping_size(dma_dev) >> SECTOR_SHIFT);
+ }
+
return 0;
}

--
2.35.3

2022-06-27 23:35:15

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v4 5/5] libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors

On 6/28/22 00:25, John Garry wrote:
> ATA devices (struct ata_device) have a max_sectors field which is
> configured internally in libata. This is then used to (re)configure the
> associated sdev request queue max_sectors value from how it is earlier set
> in __scsi_init_queue(). In __scsi_init_queue() the max_sectors value is set
> according to shost limits, which includes host DMA mapping limits.
>
> Cap the ata_device max_sectors according to shost->max_sectors to respect
> this shost limit.
>
> Signed-off-by: John Garry <[email protected]>
> Acked-by: Damien Le Moal <[email protected]>

Nit: please change the patch title to "ata: libata-scsi: Cap ..."

> ---
> drivers/ata/libata-scsi.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index 86dbb1cdfabd..24a43d540d9f 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -1060,6 +1060,7 @@ int ata_scsi_dev_config(struct scsi_device *sdev, struct ata_device *dev)
> dev->flags |= ATA_DFLAG_NO_UNLOAD;
>
> /* configure max sectors */
> + dev->max_sectors = min(dev->max_sectors, sdev->host->max_sectors);
> blk_queue_max_hw_sectors(q, dev->max_sectors);
>
> if (dev->class == ATA_DEV_ATAPI) {


--
Damien Le Moal
Western Digital Research

2022-06-28 08:16:33

by John Garry

[permalink] [raw]
Subject: Re: [PATCH v4 5/5] libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors

On 28/06/2022 00:24, Damien Le Moal wrote:
> On 6/28/22 00:25, John Garry wrote:
>> ATA devices (struct ata_device) have a max_sectors field which is
>> configured internally in libata. This is then used to (re)configure the
>> associated sdev request queue max_sectors value from how it is earlier set
>> in __scsi_init_queue(). In __scsi_init_queue() the max_sectors value is set
>> according to shost limits, which includes host DMA mapping limits.
>>
>> Cap the ata_device max_sectors according to shost->max_sectors to respect
>> this shost limit.
>>
>> Signed-off-by: John Garry<[email protected]>
>> Acked-by: Damien Le Moal<[email protected]>
> Nit: please change the patch title to "ata: libata-scsi: Cap ..."
>

ok, but it's going to be an even longer title :)

BTW, this patch has no real dependency on the rest of the series, so
could be taken separately if you prefer.

Thanks,
John

2022-06-28 09:25:09

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v4 5/5] libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors

On 6/28/22 16:54, John Garry wrote:
> On 28/06/2022 00:24, Damien Le Moal wrote:
>> On 6/28/22 00:25, John Garry wrote:
>>> ATA devices (struct ata_device) have a max_sectors field which is
>>> configured internally in libata. This is then used to (re)configure the
>>> associated sdev request queue max_sectors value from how it is earlier set
>>> in __scsi_init_queue(). In __scsi_init_queue() the max_sectors value is set
>>> according to shost limits, which includes host DMA mapping limits.
>>>
>>> Cap the ata_device max_sectors according to shost->max_sectors to respect
>>> this shost limit.
>>>
>>> Signed-off-by: John Garry<[email protected]>
>>> Acked-by: Damien Le Moal<[email protected]>
>> Nit: please change the patch title to "ata: libata-scsi: Cap ..."
>>
>
> ok, but it's going to be an even longer title :)
>
> BTW, this patch has no real dependency on the rest of the series, so
> could be taken separately if you prefer.

Sure, you can send it separately. Adding it through the scsi tree is fine too.

>
> Thanks,
> John


--
Damien Le Moal
Western Digital Research

2022-06-28 11:19:04

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v4 2/5] dma-iommu: Add iommu_dma_opt_mapping_size()

On 2022-06-27 16:25, John Garry wrote:
> Add the IOMMU callback for DMA mapping API dma_opt_mapping_size(), which
> allows the drivers to know the optimal mapping limit and thus limit the
> requested IOVA lengths.
>
> This value is based on the IOVA rcache range limit, as IOVAs allocated
> above this limit must always be newly allocated, which may be quite slow.

Acked-by: Robin Murphy <[email protected]>

> Signed-off-by: John Garry <[email protected]>
> Reviewed-by: Damien Le Moal <[email protected]>
> ---
> drivers/iommu/dma-iommu.c | 6 ++++++
> drivers/iommu/iova.c | 5 +++++
> include/linux/iova.h | 2 ++
> 3 files changed, 13 insertions(+)
>
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index f90251572a5d..9e1586447ee8 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -1459,6 +1459,11 @@ static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
> return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
> }
>
> +static size_t iommu_dma_opt_mapping_size(void)
> +{
> + return iova_rcache_range();
> +}
> +
> static const struct dma_map_ops iommu_dma_ops = {
> .alloc = iommu_dma_alloc,
> .free = iommu_dma_free,
> @@ -1479,6 +1484,7 @@ static const struct dma_map_ops iommu_dma_ops = {
> .map_resource = iommu_dma_map_resource,
> .unmap_resource = iommu_dma_unmap_resource,
> .get_merge_boundary = iommu_dma_get_merge_boundary,
> + .opt_mapping_size = iommu_dma_opt_mapping_size,
> };
>
> /*
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index db77aa675145..9f00b58d546e 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -26,6 +26,11 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
> static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
> static void free_iova_rcaches(struct iova_domain *iovad);
>
> +unsigned long iova_rcache_range(void)
> +{
> + return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
> +}
> +
> static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
> {
> struct iova_domain *iovad;
> diff --git a/include/linux/iova.h b/include/linux/iova.h
> index 320a70e40233..c6ba6d95d79c 100644
> --- a/include/linux/iova.h
> +++ b/include/linux/iova.h
> @@ -79,6 +79,8 @@ static inline unsigned long iova_pfn(struct iova_domain *iovad, dma_addr_t iova)
> int iova_cache_get(void);
> void iova_cache_put(void);
>
> +unsigned long iova_rcache_range(void);
> +
> void free_iova(struct iova_domain *iovad, unsigned long pfn);
> void __free_iova(struct iova_domain *iovad, struct iova *iova);
> struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,

2022-06-28 11:39:35

by John Garry

[permalink] [raw]
Subject: Re: [PATCH v4 5/5] libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors

On 28/06/2022 10:14, Damien Le Moal wrote:
>> BTW, this patch has no real dependency on the rest of the series, so
>> could be taken separately if you prefer.
> Sure, you can send it separately. Adding it through the scsi tree is fine too.
>

Well Christoph originally offered to take this series via the
dma-mapping tree.

@Christoph, is that still ok with you? If so, would you rather I send
this libata patch separately?

Thanks,
john

2022-06-29 05:50:44

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v4 5/5] libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors

On Tue, Jun 28, 2022 at 12:33:58PM +0100, John Garry wrote:
> Well Christoph originally offered to take this series via the dma-mapping
> tree.
>
> @Christoph, is that still ok with you? If so, would you rather I send this
> libata patch separately?

The offer still stands, and I don't really care where the libata
patch is routed. Just tell me what you prefer.

2022-06-29 06:24:30

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v4 5/5] libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors

On 6/29/22 14:40, Christoph Hellwig wrote:
> On Tue, Jun 28, 2022 at 12:33:58PM +0100, John Garry wrote:
>> Well Christoph originally offered to take this series via the dma-mapping
>> tree.
>>
>> @Christoph, is that still ok with you? If so, would you rather I send this
>> libata patch separately?
>
> The offer still stands, and I don't really care where the libata
> patch is routed. Just tell me what you prefer.

If it is 100% independent from the other patches, I can take it.
Otherwise, feel free to take it !

--
Damien Le Moal
Western Digital Research

2022-06-29 07:52:32

by John Garry

[permalink] [raw]
Subject: Re: [PATCH v4 5/5] libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors

On 29/06/2022 06:58, Damien Le Moal wrote:
> On 6/29/22 14:40, Christoph Hellwig wrote:
>> On Tue, Jun 28, 2022 at 12:33:58PM +0100, John Garry wrote:
>>> Well Christoph originally offered to take this series via the dma-mapping
>>> tree.
>>>
>>> @Christoph, is that still ok with you? If so, would you rather I send this
>>> libata patch separately?
>>
>> The offer still stands, and I don't really care where the libata
>> patch is routed. Just tell me what you prefer.

Cheers.

>
> If it is 100% independent from the other patches, I can take it.
> Otherwise, feel free to take it !
>

I'll just keep the all together - it's easier in case I need to change
anything.

Thanks!

2022-06-29 08:27:22

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v4 5/5] libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors

On 6/29/22 16:43, John Garry wrote:
> On 29/06/2022 06:58, Damien Le Moal wrote:
>> On 6/29/22 14:40, Christoph Hellwig wrote:
>>> On Tue, Jun 28, 2022 at 12:33:58PM +0100, John Garry wrote:
>>>> Well Christoph originally offered to take this series via the dma-mapping
>>>> tree.
>>>>
>>>> @Christoph, is that still ok with you? If so, would you rather I send this
>>>> libata patch separately?
>>>
>>> The offer still stands, and I don't really care where the libata
>>> patch is routed. Just tell me what you prefer.
>
> Cheers.
>
>>
>> If it is 100% independent from the other patches, I can take it.
>> Otherwise, feel free to take it !
>>
>
> I'll just keep the all together - it's easier in case I need to change
> anything.

Works for me.

>
> Thanks!


--
Damien Le Moal
Western Digital Research