For SCSI hosts which enable host_tagset, the NUMA node returned from
blk_mq_hw_queue_to_node() is always NUMA_NO_NODE. Since the default we
choose for the tag_set NUMA node in scsi_mq_setup_tags() is also
NUMA_NO_NODE, functions like blk_mq_alloc_rq_map() always evaluate the
NUMA node as NUMA_NO_NODE.

The reason we get NUMA_NO_NODE from blk_mq_hw_queue_to_node() is that
the hctx_idx passed is BLK_MQ_NO_HCTX_IDX, so we can't match against a
(HW) queue mapping index.

Improve this by defaulting the tag_set NUMA node to the NUMA node of the
SCSI host DMA device. As shost->dma_dev must be set before it is read,
move the scsi_mq_setup_tags() call in scsi_add_host_with_dma() to after
that assignment.

Signed-off-by: John Garry <[email protected]>
---
Difference to v1:
- use dev_to_node()
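
For readers unfamiliar with the fallback described in the commit message,
the following is a minimal userspace sketch, not the kernel code. It
assumes the allocation path first asks for the node of the given hardware
queue and, when no queue matches, falls back to the tag set's default
node. All model_* names, the MODEL_* constants and the 4-CPU topology are
hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Illustrative stand-ins for NUMA_NO_NODE and BLK_MQ_NO_HCTX_IDX. */
#define MODEL_NO_NODE     (-1)
#define MODEL_NO_HCTX_IDX (~0u)
#define MODEL_NR_CPUS     4

/* Hypothetical topology: CPUs 0-1 on node 0, CPUs 2-3 on node 1. */
static const int model_cpu_to_node[MODEL_NR_CPUS] = { 0, 0, 1, 1 };

/* Hypothetical CPU -> HW queue map: two queues serving two CPUs each. */
static const unsigned int model_mq_map[MODEL_NR_CPUS] = { 0, 0, 1, 1 };

/*
 * Return the node of the first CPU mapped to hctx_idx, or MODEL_NO_NODE
 * when no CPU maps to that queue index.
 */
static int model_hw_queue_to_node(unsigned int hctx_idx)
{
	for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (model_mq_map[cpu] == hctx_idx)
			return model_cpu_to_node[cpu];
	}
	return MODEL_NO_NODE;
}

/*
 * Pick the node for an allocation: prefer the queue's node, otherwise
 * fall back to the tag set's default node.
 */
static int model_alloc_node(unsigned int hctx_idx, int tag_set_node)
{
	int node = model_hw_queue_to_node(hctx_idx);

	return node == MODEL_NO_NODE ? tag_set_node : node;
}
```

In this model, model_alloc_node(MODEL_NO_HCTX_IDX, MODEL_NO_NODE) can only
yield MODEL_NO_NODE, which mirrors the behaviour before this patch. Passing
a real node as the tag set default, as this patch does via dev_to_node(),
gives the fallback something useful to return.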
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index f69b77cbf538..8352f90d997d 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -229,10 +229,6 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
 	if (error)
 		goto fail;
-	error = scsi_mq_setup_tags(shost);
-	if (error)
-		goto fail;
-
 	if (!shost->shost_gendev.parent)
 		shost->shost_gendev.parent = dev ? dev : &platform_bus;
 	if (!dma_dev)
@@ -240,6 +236,10 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
 	shost->dma_dev = dma_dev;
+	error = scsi_mq_setup_tags(shost);
+	if (error)
+		goto fail;
+
 	/*
 	 * Increase usage count temporarily here so that calling
 	 * scsi_autopm_put_host() will trigger runtime idle if there is
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index a7788184908e..e14ad193a9c8 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1977,7 +1977,7 @@ int scsi_mq_setup_tags(struct Scsi_Host *shost)
 	tag_set->nr_maps = shost->nr_maps ? : 1;
 	tag_set->queue_depth = shost->can_queue;
 	tag_set->cmd_size = cmd_size;
-	tag_set->numa_node = NUMA_NO_NODE;
+	tag_set->numa_node = dev_to_node(shost->dma_dev);
 	tag_set->flags = BLK_MQ_F_SHOULD_MERGE;
 	tag_set->flags |=
 		BLK_ALLOC_POLICY_TO_MQ_FLAG(shost->hostt->tag_alloc_policy);
--
2.26.2
On Wed, 30 Mar 2022 19:38:35 +0800, John Garry wrote:
> For SCSI hosts which enable host_tagset the NUMA node returned from
> blk_mq_hw_queue_to_node() is NUMA_NO_NODE always. Then, since in
> scsi_mq_setup_tags() the default we choose for the tag_set NUMA node is
> NUMA_NO_NODE, we always evaluate the NUMA node as NUMA_NO_NODE in
> functions like blk_mq_alloc_rq_map().
>
> The reason we get NUMA_NO_NODE from blk_mq_hw_queue_to_node() is that
> the hctx_idx passed is BLK_MQ_NO_HCTX_IDX - so we can't match against a
> (HW) queue mapping index.
>
> [...]

Applied to 5.19/scsi-queue, thanks!

[1/1] scsi: core: Refine how we set tag_set NUMA node
      https://git.kernel.org/mkp/scsi/c/973dac8a8a14

--
Martin K. Petersen Oracle Linux Engineering