Drivers currently rely on irq_set_affinity_hint() to set the
affinity_hint that is consumed by userspace, to enforce a custom
affinity, or both.
irq_set_affinity_hint(), as the name suggests, was originally introduced to
only set the affinity_hint to help userspace in guiding the interrupts,
and not the affinity itself. However, since commit
e2e64a932556 "genirq: Set initial affinity in irq_set_affinity_hint()"
irq_set_affinity_hint() also started applying the provided cpumask (if not
NULL) as the affinity for the interrupts. That commit was meant to let
drivers enforce their affinity mask so that the interrupts are distributed
across the CPUs and don't always end up on CPU0. The irq subsystem has
handled this by itself since commit
a0c9259dc4e1 "irq/matrix: Spread interrupts on allocation"
Hence, drivers no longer need to overwrite the affinity to spread the
interrupts, as the spreading is performed dynamically at allocation time.
Also, irq_set_affinity_hint() applying the mask as the affinity causes
problems for drivers that only want to set their affinity_hint and not the
affinity itself: for the interrupts of these drivers the
default_smp_affinity_mask is completely ignored (for a detailed
investigation please refer to [1]).
Unfortunately reverting the commit e2e64a932556 is not an option at this
point for two reasons [2]:
- Several drivers for a valid reason (performance) rely on this API to
enforce their affinity mask
- Until very recently this was the only exported interface that was
available
To sort this out, Thomas has come up with the following interfaces:
- irq_set_affinity(): Only sets the affinity of an IRQ [3]
- irq_update_affinity_hint(): Only sets the hint [4]
- irq_set_affinity_and_hint(): Sets both the affinity and the hint mask [4]
The first API is already merged in the linux-next tree, and the patch
that introduces the other two interfaces is included in this patch-set.
To reach the stage where we can safely get rid of
irq_set_affinity_hint(), which has been marked deprecated, we have to
move all of its consumers to these new interfaces. In this patch-set, I have
done that for a few drivers, and I will try to move the remaining ones
in the coming days.
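As a rough illustration of how the three interfaces differ, here is a small user-space model. It is not kernel code: `model_cpumask`, `struct model_irq` and every `model_*` function are hypothetical stand-ins invented for this sketch, and rejecting a NULL mask in `model_set_affinity()` is a choice of the model only. What it mirrors from the series is the shape of the shared helper, which always stores the hint and applies the mask as affinity only when asked to and when the mask is not NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for struct cpumask: one bit per CPU (illustrative). */
typedef unsigned long model_cpumask;

/* Hypothetical, heavily reduced stand-in for an interrupt descriptor. */
struct model_irq {
	model_cpumask affinity;		/* where the interrupt is routed */
	const model_cpumask *hint;	/* what userspace is advised to use */
};

#define MODEL_NR_IRQS 4
static struct model_irq model_irqs[MODEL_NR_IRQS];

/* Store the hint; apply it as affinity only if requested and non-NULL. */
static int model_apply_affinity_hint(unsigned int irq, const model_cpumask *m,
				     bool setaffinity)
{
	if (irq >= MODEL_NR_IRQS)
		return -EINVAL;
	model_irqs[irq].hint = m;
	if (m && setaffinity)
		model_irqs[irq].affinity = *m;
	return 0;
}

/* Affinity only; the hint is left alone (NULL rejected by this model). */
static int model_set_affinity(unsigned int irq, const model_cpumask *m)
{
	if (irq >= MODEL_NR_IRQS || !m)
		return -EINVAL;
	model_irqs[irq].affinity = *m;
	return 0;
}

/* Hint only. */
static int model_update_affinity_hint(unsigned int irq, const model_cpumask *m)
{
	return model_apply_affinity_hint(irq, m, false);
}

/* Hint and affinity. */
static int model_set_affinity_and_hint(unsigned int irq, const model_cpumask *m)
{
	return model_apply_affinity_hint(irq, m, true);
}

/* Walk through the three calls on IRQ 0 and check the resulting state. */
static void model_demo(void)
{
	static const model_cpumask cpu1 = 1UL << 1, cpu2 = 1UL << 2;

	model_irqs[0].affinity = 1UL << 0;	/* whatever allocation picked */
	model_irqs[0].hint = NULL;

	model_update_affinity_hint(0, &cpu1);	/* hint changes, affinity stays */
	assert(model_irqs[0].hint == &cpu1);
	assert(model_irqs[0].affinity == (1UL << 0));

	model_set_affinity_and_hint(0, &cpu2);	/* both change */
	assert(model_irqs[0].hint == &cpu2 && model_irqs[0].affinity == cpu2);

	model_set_affinity(0, &cpu1);		/* affinity changes, hint stays */
	assert(model_irqs[0].hint == &cpu2 && model_irqs[0].affinity == cpu1);
}
```

Calling `model_demo()` from a `main()` runs the checks. The point of the model is only that a driver converting away from irq_set_affinity_hint() has to decide which of the two effects it actually relied on.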
Testing
-------
I have performed some basic testing on x86 to verify, for example, that
the interrupts are evenly spread across all CPUs and that the hint mask is
correctly set, for the drivers i40e, iavf, mlx5, mlx4, ixgbe, i40iw
and enic on top of:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
More testing is probably required for these and for the drivers that I
didn't test, and any help will be much appreciated.
Notes
-----
- I was told that the i40iw driver is going to be replaced by irdma; however,
the new driver has not landed in Linus's tree yet. Once it does, I will send
a follow-up patch for that as well.
- For the mpt3sas driver I decided to use
irq_set_affinity_and_hint over irq_set_affinity based on my limited
analysis of it and the megaraid driver. However, if we are sure that the
hint is not required, then I can replace it with just irq_set_affinity, as
one of its comments suggests.
Changes from v1 [5]
-------------------
- Fixed compilation error by adding the new interface definitions for cases
where CONFIG_SMP is not defined
- Fixed function usage in megaraid_sas and removed unnecessary variable
(Robin Murphy)
- Removed unwanted #if/endif from mlx4 (Leon Romanovsky)
- Other indentation related fixes
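The first item above follows a common pattern: when the feature is compiled out, a stub with the same signature is provided so that callers still build and need no #ifdef of their own. A minimal user-space sketch of that pattern is below. `MODEL_SMP`, `model_cpumask` and the `model_*` functions are hypothetical names for this illustration; only the -EINVAL return of the stub is taken from the stubs in this series.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef unsigned long model_cpumask;	/* illustrative stand-in for struct cpumask */

/* Build with -DMODEL_SMP to select the "real" path (hypothetical switch). */
#ifdef MODEL_SMP
static const model_cpumask *model_hint;

static inline int model_update_affinity_hint(unsigned int irq,
					     const model_cpumask *m)
{
	(void)irq;
	model_hint = m;		/* the "real" variant records the hint */
	return 0;
}
#else
/* Stub: same signature, so callers compile unchanged; the call just fails. */
static inline int model_update_affinity_hint(unsigned int irq,
					     const model_cpumask *m)
{
	(void)irq;
	(void)m;
	return -EINVAL;
}
#endif

/* A caller needs no conditional compilation around the call. */
static int model_driver_teardown(unsigned int irq)
{
	return model_update_affinity_hint(irq, NULL);
}

/* Check the result that matches the way this file was built. */
static void model_stub_demo(void)
{
#ifdef MODEL_SMP
	assert(model_driver_teardown(3) == 0);
#else
	assert(model_driver_teardown(3) == -EINVAL);
#endif
}
```

Without the stub, the unguarded call in `model_driver_teardown()` would fail to compile in the configuration where the function is not declared.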
[1] https://lore.kernel.org/lkml/[email protected]/
[2] https://lore.kernel.org/linux-pci/[email protected]/
[3] https://lore.kernel.org/linux-arm-kernel/[email protected]/
[4] https://lore.kernel.org/patchwork/patch/1434326/
[5] https://lore.kernel.org/linux-scsi/[email protected]/
Nitesh Narayan Lal (13):
iavf: Use irq_update_affinity_hint
i40e: Use irq_update_affinity_hint
scsi: megaraid_sas: Use irq_set_affinity_and_hint
scsi: mpt3sas: Use irq_set_affinity_and_hint
RDMA/i40iw: Use irq_update_affinity_hint
enic: Use irq_update_affinity_hint
be2net: Use irq_update_affinity_hint
ixgbe: Use irq_update_affinity_hint
mailbox: Use irq_update_affinity_hint
scsi: lpfc: Use irq_set_affinity
hinic: Use irq_set_affinity_and_hint
net/mlx5: Use irq_update_affinity_hint
net/mlx4: Use irq_update_affinity_hint
Thomas Gleixner (1):
genirq: Provide new interfaces for affinity hints
drivers/infiniband/hw/i40iw/i40iw_main.c | 4 +-
drivers/mailbox/bcm-flexrm-mailbox.c | 4 +-
drivers/net/ethernet/cisco/enic/enic_main.c | 8 +--
drivers/net/ethernet/emulex/benet/be_main.c | 4 +-
drivers/net/ethernet/huawei/hinic/hinic_rx.c | 4 +-
drivers/net/ethernet/intel/i40e/i40e_main.c | 8 +--
drivers/net/ethernet/intel/iavf/iavf_main.c | 8 +--
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 10 ++--
drivers/net/ethernet/mellanox/mlx4/eq.c | 8 ++-
.../net/ethernet/mellanox/mlx5/core/pci_irq.c | 6 +--
drivers/scsi/lpfc/lpfc_init.c | 4 +-
drivers/scsi/megaraid/megaraid_sas_base.c | 27 +++++-----
drivers/scsi/mpt3sas/mpt3sas_base.c | 21 ++++----
include/linux/interrupt.h | 53 ++++++++++++++++++-
kernel/irq/manage.c | 8 +--
15 files changed, 113 insertions(+), 64 deletions(-)
--
The driver uses irq_set_affinity_hint() for two purposes:
- To set the affinity_hint which is consumed by the userspace for
distributing the interrupts
- To apply an affinity that it provides for the iavf interrupts
The latter is done to ensure that all the interrupts are evenly spread
across all available CPUs. However, since commit a0c9259dc4e1 ("irq/matrix:
Spread interrupts on allocation") the spreading of interrupts is
dynamically performed at the time of allocation. Hence, there is no need
for the drivers to enforce their own affinity for the spreading of
interrupts.
Also, irq_set_affinity_hint() applying the provided cpumask as an affinity
for the interrupt is an undocumented side effect. To remove this side
effect irq_set_affinity_hint() has been marked as deprecated and new
interfaces have been introduced. Hence, replace the irq_set_affinity_hint()
with the new interface irq_update_affinity_hint() that only sets the
pointer for the affinity_hint.
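For readers unfamiliar with how the per-vector hint is chosen, the following is a simplified user-space model of spreading vector indices over the available CPUs. It is only a sketch: `model_cpumask` and the `model_*` functions are hypothetical, and the real cpumask_local_spread() considers more than this (NUMA locality, for instance). The model simply picks the (index modulo number-of-CPUs)-th CPU of a mask and builds a single-CPU hint from it.

```c
#include <assert.h>

typedef unsigned long model_cpumask;	/* illustrative: one bit per CPU */

/* Count the CPUs present in the mask. */
static unsigned int model_weight(model_cpumask mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Return the (i mod weight)-th CPU of a non-empty mask, lowest CPU first. */
static unsigned int model_local_spread(unsigned int i, model_cpumask online)
{
	unsigned int cpu;

	assert(online != 0);
	i %= model_weight(online);	/* wrap so a CPU is always found */
	for (cpu = 0; ; cpu++) {
		if (!(online & (1UL << cpu)))
			continue;
		if (i-- == 0)
			return cpu;
	}
}

/* Single-CPU hint mask for a given vector index. */
static model_cpumask model_hint_for_vector(unsigned int v_idx,
					   model_cpumask online)
{
	return 1UL << model_local_spread(v_idx, online);
}

/* With CPUs 0, 1 and 3 available, five vectors map to 0, 1, 3, 0, 1. */
static void model_spread_demo(void)
{
	const model_cpumask online = 0xbUL;	/* made-up example mask */

	assert(model_local_spread(0, online) == 0);
	assert(model_local_spread(1, online) == 1);
	assert(model_local_spread(2, online) == 3);
	assert(model_local_spread(3, online) == 0);
	assert(model_local_spread(4, online) == 1);
	assert(model_hint_for_vector(2, online) == (1UL << 3));
}
```

After this change the mask computed this way is only advertised as a hint; where the interrupt is actually routed is left to the irq core and to userspace.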
Signed-off-by: Nitesh Narayan Lal <[email protected]>
---
drivers/net/ethernet/intel/iavf/iavf_main.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index e612c24fa384..f2e8fab53cb9 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -449,10 +449,10 @@ iavf_request_traffic_irqs(struct iavf_adapter *adapter, char *basename)
irq_set_affinity_notifier(irq_num, &q_vector->affinity_notify);
/* Spread the IRQ affinity hints across online CPUs. Note that
* get_cpu_mask returns a mask with a permanent lifetime so
- * it's safe to use as a hint for irq_set_affinity_hint.
+ * it's safe to use as a hint for irq_update_affinity_hint.
*/
cpu = cpumask_local_spread(q_vector->v_idx, -1);
- irq_set_affinity_hint(irq_num, get_cpu_mask(cpu));
+ irq_update_affinity_hint(irq_num, get_cpu_mask(cpu));
}
return 0;
@@ -462,7 +462,7 @@ iavf_request_traffic_irqs(struct iavf_adapter *adapter, char *basename)
vector--;
irq_num = adapter->msix_entries[vector + NONQ_VECS].vector;
irq_set_affinity_notifier(irq_num, NULL);
- irq_set_affinity_hint(irq_num, NULL);
+ irq_update_affinity_hint(irq_num, NULL);
free_irq(irq_num, &adapter->q_vectors[vector]);
}
return err;
@@ -514,7 +514,7 @@ static void iavf_free_traffic_irqs(struct iavf_adapter *adapter)
for (vector = 0; vector < q_vectors; vector++) {
irq_num = adapter->msix_entries[vector + NONQ_VECS].vector;
irq_set_affinity_notifier(irq_num, NULL);
- irq_set_affinity_hint(irq_num, NULL);
+ irq_update_affinity_hint(irq_num, NULL);
free_irq(irq_num, &adapter->q_vectors[vector]);
}
}
--
2.27.0
The driver uses irq_set_affinity_hint() specifically for the high IOPS
queue interrupts for two purposes:
- To set the affinity_hint which is consumed by the userspace for
distributing the interrupts
- To apply an affinity that it provides
The driver enforces its own affinity to bind the high IOPS queue interrupts
to the local NUMA node. However, irq_set_affinity_hint() applying the
provided cpumask as an affinity (if not NULL) for the interrupt is an
undocumented side effect.
To remove this side effect irq_set_affinity_hint() has been marked
as deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface irq_set_affinity_and_hint()
that clearly indicates the purpose of the usage and is meant to apply the
affinity and set the affinity_hint pointer. Also, replace
irq_set_affinity_hint() with irq_update_affinity_hint() when only
affinity_hint needs to be updated.
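The setup and teardown split described above can be sketched with a small user-space model. Everything named `model_*` here is hypothetical, including the queue count and the node mask value. The only behaviour borrowed from the series is that a NULL mask clears the hint without touching the affinity, which is why teardown can use the hint-only interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long model_cpumask;	/* illustrative stand-in for struct cpumask */

#define MODEL_NR_QUEUES 4

/* Hypothetical per-queue interrupt state; not the kernel's irq descriptor. */
struct model_irq {
	model_cpumask affinity;
	const model_cpumask *hint;
};

static struct model_irq model_queue_irq[MODEL_NR_QUEUES];

/* Hint is always stored; affinity is applied only on request and if non-NULL. */
static void model_apply(struct model_irq *irq, const model_cpumask *m,
			bool setaffinity)
{
	irq->hint = m;
	if (m && setaffinity)
		irq->affinity = *m;
}

/* Setup: bind the first high_iops queues to the node mask and advertise it. */
static void model_assign_queues(unsigned int high_iops,
				const model_cpumask *node_mask)
{
	unsigned int i;

	for (i = 0; i < high_iops && i < MODEL_NR_QUEUES; i++)
		model_apply(&model_queue_irq[i], node_mask, true);
}

/* Teardown: drop the hint pointer only. */
static void model_free_queues(void)
{
	unsigned int i;

	for (i = 0; i < MODEL_NR_QUEUES; i++)
		model_apply(&model_queue_irq[i], NULL, false);
}

static void model_lifecycle_demo(void)
{
	static const model_cpumask node0 = 0x0fUL;	/* CPUs 0-3, made up */

	model_assign_queues(2, &node0);
	assert(model_queue_irq[0].hint == &node0);
	assert(model_queue_irq[1].affinity == node0);
	assert(model_queue_irq[2].hint == NULL);	/* not a high IOPS queue */

	model_free_queues();
	assert(model_queue_irq[0].hint == NULL);
	assert(model_queue_irq[0].affinity == node0);	/* left as it was */
}
```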
Signed-off-by: Nitesh Narayan Lal <[email protected]>
---
drivers/scsi/mpt3sas/mpt3sas_base.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 5779f313f6f8..d07cf66ecf09 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -2990,6 +2990,7 @@ _base_check_enable_msix(struct MPT3SAS_ADAPTER *ioc)
static void
_base_free_irq(struct MPT3SAS_ADAPTER *ioc)
{
+ unsigned int irq;
struct adapter_reply_queue *reply_q, *next;
if (list_empty(&ioc->reply_queue_list))
@@ -2997,9 +2998,10 @@ _base_free_irq(struct MPT3SAS_ADAPTER *ioc)
list_for_each_entry_safe(reply_q, next, &ioc->reply_queue_list, list) {
list_del(&reply_q->list);
- if (ioc->smp_affinity_enable)
- irq_set_affinity_hint(pci_irq_vector(ioc->pdev,
- reply_q->msix_index), NULL);
+ if (ioc->smp_affinity_enable) {
+ irq = pci_irq_vector(ioc->pdev, reply_q->msix_index);
+ irq_update_affinity_hint(irq, NULL);
+ }
free_irq(pci_irq_vector(ioc->pdev, reply_q->msix_index),
reply_q);
kfree(reply_q);
@@ -3055,16 +3057,13 @@ _base_request_irq(struct MPT3SAS_ADAPTER *ioc, u8 index)
* @ioc: per adapter object
*
* The enduser would need to set the affinity via /proc/irq/#/smp_affinity
- *
- * It would nice if we could call irq_set_affinity, however it is not
- * an exported symbol
*/
static void
_base_assign_reply_queues(struct MPT3SAS_ADAPTER *ioc)
{
- unsigned int cpu, nr_cpus, nr_msix, index = 0;
+ unsigned int cpu, nr_cpus, nr_msix, index = 0, irq;
struct adapter_reply_queue *reply_q;
- int local_numa_node;
+ const struct cpumask *mask;
if (!_base_is_controller_msix_enabled(ioc))
return;
@@ -3087,11 +3086,11 @@ _base_assign_reply_queues(struct MPT3SAS_ADAPTER *ioc)
* corresponding to high iops queues.
*/
if (ioc->high_iops_queues) {
- local_numa_node = dev_to_node(&ioc->pdev->dev);
+ mask = cpumask_of_node(dev_to_node(&ioc->pdev->dev));
for (index = 0; index < ioc->high_iops_queues;
index++) {
- irq_set_affinity_hint(pci_irq_vector(ioc->pdev,
- index), cpumask_of_node(local_numa_node));
+ irq = pci_irq_vector(ioc->pdev, index);
+ irq_set_affinity_and_hint(irq, mask);
}
}
--
2.27.0
From: Thomas Gleixner <[email protected]>
The discussion about removing the side effect of irq_set_affinity_hint() of
actually applying the cpumask (if not NULL) as affinity to the interrupt,
unearthed a few unpleasantries:
1) The modular perf drivers rely on the current behaviour for the very
wrong reasons.
2) While none of the other drivers prevents user space from changing
the affinity, a cursory inspection shows that there are at least
expectations in some drivers.
#1 needs to be cleaned up anyway, so that's not a problem
#2 might result in subtle regressions especially when irqbalanced (which
nowadays ignores the affinity hint) is disabled.
Provide new interfaces:
irq_update_affinity_hint() - Only sets the affinity hint pointer
irq_set_affinity_and_hint() - Set the pointer and apply the affinity to
the interrupt
Make irq_set_affinity_hint() a wrapper around __irq_apply_affinity_hint() and
document it to be phased out.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Nitesh Narayan Lal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
include/linux/interrupt.h | 53 ++++++++++++++++++++++++++++++++++++++-
kernel/irq/manage.c | 8 +++---
2 files changed, 56 insertions(+), 5 deletions(-)
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 2ed65b01c961..bd25ba029c68 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -328,7 +328,46 @@ extern int irq_force_affinity(unsigned int irq, const struct cpumask *cpumask);
extern int irq_can_set_affinity(unsigned int irq);
extern int irq_select_affinity(unsigned int irq);
-extern int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m);
+extern int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
+ bool setaffinity);
+
+/**
+ * irq_update_affinity_hint - Update the affinity hint
+ * @irq: Interrupt to update
+ * @m: cpumask pointer (NULL to clear the hint)
+ *
+ * Updates the affinity hint, but does not change the affinity of the interrupt.
+ */
+static inline int
+irq_update_affinity_hint(unsigned int irq, const struct cpumask *m)
+{
+ return __irq_apply_affinity_hint(irq, m, false);
+}
+
+/**
+ * irq_set_affinity_and_hint - Update the affinity hint and apply the provided
+ * cpumask to the interrupt
+ * @irq: Interrupt to update
+ * @m: cpumask pointer (NULL to clear the hint)
+ *
+ * Updates the affinity hint and if @m is not NULL it applies it as
+ * the affinity of that interrupt.
+ */
+static inline int
+irq_set_affinity_and_hint(unsigned int irq, const struct cpumask *m)
+{
+ return __irq_apply_affinity_hint(irq, m, true);
+}
+
+/*
+ * Deprecated. Use irq_update_affinity_hint() or irq_set_affinity_and_hint()
+ * instead.
+ */
+static inline int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
+{
+ return irq_set_affinity_and_hint(irq, m);
+}
+
extern int irq_update_affinity_desc(unsigned int irq,
struct irq_affinity_desc *affinity);
@@ -360,6 +399,18 @@ static inline int irq_can_set_affinity(unsigned int irq)
static inline int irq_select_affinity(unsigned int irq) { return 0; }
+static inline int irq_update_affinity_hint(unsigned int irq,
+ const struct cpumask *m)
+{
+ return -EINVAL;
+}
+
+static inline int irq_set_affinity_and_hint(unsigned int irq,
+ const struct cpumask *m)
+{
+ return -EINVAL;
+}
+
static inline int irq_set_affinity_hint(unsigned int irq,
const struct cpumask *m)
{
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index ef30b4762947..837b63e63111 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -487,7 +487,8 @@ int irq_force_affinity(unsigned int irq, const struct cpumask *cpumask)
}
EXPORT_SYMBOL_GPL(irq_force_affinity);
-int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
+int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
+ bool setaffinity)
{
unsigned long flags;
struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
@@ -496,12 +497,11 @@ int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
return -EINVAL;
desc->affinity_hint = m;
irq_put_desc_unlock(desc, flags);
- /* set the initial affinity to prevent every interrupt being on CPU0 */
- if (m)
+ if (m && setaffinity)
__irq_set_affinity(irq, m, false);
return 0;
}
-EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
+EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);
static void irq_affinity_notify(struct work_struct *work)
{
--
2.27.0
The driver uses irq_set_affinity_hint() for two purposes:
- To set the affinity_hint which is consumed by the userspace for
distributing the interrupts
- To apply an affinity that it provides for the i40e interrupts
The latter is done to ensure that all the interrupts are evenly spread
across all available CPUs. However, since commit a0c9259dc4e1 ("irq/matrix:
Spread interrupts on allocation") the spreading of interrupts is
dynamically performed at the time of allocation. Hence, there is no need
for the drivers to enforce their own affinity for the spreading of
interrupts.
Also, irq_set_affinity_hint() applying the provided cpumask as an affinity
for the interrupt is an undocumented side effect. To remove this side
effect irq_set_affinity_hint() has been marked as deprecated and new
interfaces have been introduced. Hence, replace the irq_set_affinity_hint()
with the new interface irq_update_affinity_hint() that only sets the
pointer for the affinity_hint.
Signed-off-by: Nitesh Narayan Lal <[email protected]>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 704e474879c5..a4439a86aeb8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -3873,10 +3873,10 @@ static int i40e_vsi_request_irq_msix(struct i40e_vsi *vsi, char *basename)
*
* get_cpu_mask returns a static constant mask with
* a permanent lifetime so it's ok to pass to
- * irq_set_affinity_hint without making a copy.
+ * irq_update_affinity_hint without making a copy.
*/
cpu = cpumask_local_spread(q_vector->v_idx, -1);
- irq_set_affinity_hint(irq_num, get_cpu_mask(cpu));
+ irq_update_affinity_hint(irq_num, get_cpu_mask(cpu));
}
vsi->irqs_ready = true;
@@ -3887,7 +3887,7 @@ static int i40e_vsi_request_irq_msix(struct i40e_vsi *vsi, char *basename)
vector--;
irq_num = pf->msix_entries[base + vector].vector;
irq_set_affinity_notifier(irq_num, NULL);
- irq_set_affinity_hint(irq_num, NULL);
+ irq_update_affinity_hint(irq_num, NULL);
free_irq(irq_num, &vsi->q_vectors[vector]);
}
return err;
@@ -4695,7 +4695,7 @@ static void i40e_vsi_free_irq(struct i40e_vsi *vsi)
/* clear the affinity notifier in the IRQ descriptor */
irq_set_affinity_notifier(irq_num, NULL);
/* remove our suggested affinity mask for this IRQ */
- irq_set_affinity_hint(irq_num, NULL);
+ irq_update_affinity_hint(irq_num, NULL);
synchronize_irq(irq_num);
free_irq(irq_num, vsi->q_vectors[i]);
--
2.27.0
The driver uses irq_set_affinity_hint() to update the affinity_hint mask
that is consumed by the userspace to distribute the interrupts. However,
under the hood irq_set_affinity_hint() also applies the provided cpumask
(if not NULL) as the affinity for the given interrupt which is an
undocumented side effect.
To remove this side effect irq_set_affinity_hint() has been marked
as deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface irq_update_affinity_hint()
that only updates the affinity_hint pointer.
Signed-off-by: Nitesh Narayan Lal <[email protected]>
---
drivers/net/ethernet/cisco/enic/enic_main.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index d0a8f7106958..97eb5bd62855 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -150,10 +150,10 @@ static void enic_set_affinity_hint(struct enic *enic)
!cpumask_available(enic->msix[i].affinity_mask) ||
cpumask_empty(enic->msix[i].affinity_mask))
continue;
- err = irq_set_affinity_hint(enic->msix_entry[i].vector,
- enic->msix[i].affinity_mask);
+ err = irq_update_affinity_hint(enic->msix_entry[i].vector,
+ enic->msix[i].affinity_mask);
if (err)
- netdev_warn(enic->netdev, "irq_set_affinity_hint failed, err %d\n",
+ netdev_warn(enic->netdev, "irq_update_affinity_hint failed, err %d\n",
err);
}
@@ -173,7 +173,7 @@ static void enic_unset_affinity_hint(struct enic *enic)
int i;
for (i = 0; i < enic->intr_count; i++)
- irq_set_affinity_hint(enic->msix_entry[i].vector, NULL);
+ irq_update_affinity_hint(enic->msix_entry[i].vector, NULL);
}
static int enic_udp_tunnel_set_port(struct net_device *netdev,
--
2.27.0
The driver uses irq_set_affinity_hint() specifically for the high IOPS
queue interrupts for two purposes:
- To set the affinity_hint which is consumed by the userspace for
distributing the interrupts
- To apply an affinity that it provides
The driver enforces its own affinity to bind the high IOPS queue interrupts
to the local NUMA node. However, irq_set_affinity_hint() applying the
provided cpumask as an affinity for the interrupt is an undocumented side
effect.
To remove this side effect irq_set_affinity_hint() has been marked
as deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface irq_set_affinity_and_hint()
that clearly indicates the purpose of the usage and is meant to apply the
affinity and set the affinity_hint pointer. Also, replace
irq_set_affinity_hint() with irq_update_affinity_hint() when only
affinity_hint needs to be updated.
Change the megasas_set_high_iops_queue_affinity_hint function name to
megasas_set_high_iops_queue_affinity_and_hint to clearly indicate that the
function is setting both affinity and affinity_hint.
Signed-off-by: Nitesh Narayan Lal <[email protected]>
---
drivers/scsi/megaraid/megaraid_sas_base.c | 27 +++++++++++++----------
1 file changed, 15 insertions(+), 12 deletions(-)
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 4d4e9dbe5193..f59b99925830 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -5666,7 +5666,7 @@ megasas_setup_irqs_msix(struct megasas_instance *instance, u8 is_probe)
"Failed to register IRQ for vector %d.\n", i);
for (j = 0; j < i; j++) {
if (j < instance->low_latency_index_start)
- irq_set_affinity_hint(
+ irq_update_affinity_hint(
pci_irq_vector(pdev, j), NULL);
free_irq(pci_irq_vector(pdev, j),
&instance->irq_context[j]);
@@ -5709,7 +5709,7 @@ megasas_destroy_irqs(struct megasas_instance *instance) {
if (instance->msix_vectors)
for (i = 0; i < instance->msix_vectors; i++) {
if (i < instance->low_latency_index_start)
- irq_set_affinity_hint(
+ irq_update_affinity_hint(
pci_irq_vector(instance->pdev, i), NULL);
free_irq(pci_irq_vector(instance->pdev, i),
&instance->irq_context[i]);
@@ -5840,22 +5840,25 @@ int megasas_get_device_list(struct megasas_instance *instance)
}
/**
- * megasas_set_high_iops_queue_affinity_hint - Set affinity hint for high IOPS queues
- * @instance: Adapter soft state
- * return: void
+ * megasas_set_high_iops_queue_affinity_and_hint - Set affinity and hint
+ * for high IOPS queues
+ * @instance: Adapter soft state
+ * return: void
*/
static inline void
-megasas_set_high_iops_queue_affinity_hint(struct megasas_instance *instance)
+megasas_set_high_iops_queue_affinity_and_hint(struct megasas_instance *instance)
{
int i;
- int local_numa_node;
+ unsigned int irq;
+ const struct cpumask *mask;
if (instance->perf_mode == MR_BALANCED_PERF_MODE) {
- local_numa_node = dev_to_node(&instance->pdev->dev);
+ mask = cpumask_of_node(dev_to_node(&instance->pdev->dev));
- for (i = 0; i < instance->low_latency_index_start; i++)
- irq_set_affinity_hint(pci_irq_vector(instance->pdev, i),
- cpumask_of_node(local_numa_node));
+ for (i = 0; i < instance->low_latency_index_start; i++) {
+ irq = pci_irq_vector(instance->pdev, i);
+ irq_set_affinity_and_hint(irq, mask);
+ }
}
}
@@ -5944,7 +5947,7 @@ megasas_alloc_irq_vectors(struct megasas_instance *instance)
instance->msix_vectors = 0;
if (instance->smp_affinity_enable)
- megasas_set_high_iops_queue_affinity_hint(instance);
+ megasas_set_high_iops_queue_affinity_and_hint(instance);
}
/**
--
2.27.0
The driver uses irq_set_affinity_hint() to:
- Set the affinity_hint which is consumed by the userspace for
distributing the interrupts
- Enforce affinity
As per commit 6ac17fe8c14a ("mailbox: bcm-flexrm-mailbox: Set IRQ affinity
hint for FlexRM ring IRQs") the latter is done to ensure that the FlexRM
ring interrupts are evenly spread across all available CPUs. However, since
commit a0c9259dc4e1 ("irq/matrix: Spread interrupts on allocation") the
spreading of interrupts is dynamically performed at the time of allocation.
Hence, there is no need for the drivers to enforce their own affinity for
the spreading of interrupts.
Also, irq_set_affinity_hint() applying the provided cpumask as an affinity
for the interrupt is an undocumented side effect. To remove this side
effect irq_set_affinity_hint() has been marked as deprecated and new
interfaces have been introduced. Hence, replace the irq_set_affinity_hint()
with the new interface irq_update_affinity_hint() that only sets the
affinity_hint pointer.
Signed-off-by: Nitesh Narayan Lal <[email protected]>
---
drivers/mailbox/bcm-flexrm-mailbox.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/mailbox/bcm-flexrm-mailbox.c b/drivers/mailbox/bcm-flexrm-mailbox.c
index b4f33dc399a0..abdd06d1986a 100644
--- a/drivers/mailbox/bcm-flexrm-mailbox.c
+++ b/drivers/mailbox/bcm-flexrm-mailbox.c
@@ -1298,7 +1298,7 @@ static int flexrm_startup(struct mbox_chan *chan)
val = (num_online_cpus() < val) ? val / num_online_cpus() : 1;
cpumask_set_cpu((ring->num / val) % num_online_cpus(),
&ring->irq_aff_hint);
- ret = irq_set_affinity_hint(ring->irq, &ring->irq_aff_hint);
+ ret = irq_update_affinity_hint(ring->irq, &ring->irq_aff_hint);
if (ret) {
dev_err(ring->mbox->dev,
"failed to set IRQ affinity hint for ring%d\n",
@@ -1425,7 +1425,7 @@ static void flexrm_shutdown(struct mbox_chan *chan)
/* Release IRQ */
if (ring->irq_requested) {
- irq_set_affinity_hint(ring->irq, NULL);
+ irq_update_affinity_hint(ring->irq, NULL);
free_irq(ring->irq, ring);
ring->irq_requested = false;
}
--
2.27.0
The driver uses irq_set_affinity_hint() to:
- Set the affinity_hint which is consumed by the userspace for
distributing the interrupts
- Enforce affinity
As per commit 352f58b0d9f2 ("net-next/hinic: Set Rxq irq to specific cpu
for NUMA"), the hinic driver enforces its own affinity to bind IRQs to the
local NUMA node. However, irq_set_affinity_hint() applying the provided
cpumask as an affinity for the interrupt is an undocumented side effect.
To remove this side effect irq_set_affinity_hint() has been marked as
deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface
irq_set_affinity_and_hint() that applies the affinity and updates the
affinity_hint pointer. Also, use irq_update_affinity_hint() when only
affinity_hint needs to be updated.
Signed-off-by: Nitesh Narayan Lal <[email protected]>
---
drivers/net/ethernet/huawei/hinic/hinic_rx.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_rx.c b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
index cce08647b9b2..c6cac4bbdb49 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_rx.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
@@ -547,7 +547,7 @@ static int rx_request_irq(struct hinic_rxq *rxq)
goto err_req_irq;
cpumask_set_cpu(qp->q_id % num_online_cpus(), &rq->affinity_mask);
- err = irq_set_affinity_hint(rq->irq, &rq->affinity_mask);
+ err = irq_set_affinity_and_hint(rq->irq, &rq->affinity_mask);
if (err)
goto err_irq_affinity;
@@ -564,7 +564,7 @@ static void rx_free_irq(struct hinic_rxq *rxq)
{
struct hinic_rq *rq = rxq->rq;
- irq_set_affinity_hint(rq->irq, NULL);
+ irq_update_affinity_hint(rq->irq, NULL);
free_irq(rq->irq, rxq);
rx_del_napi(rxq);
}
--
2.27.0
The driver uses irq_set_affinity_hint() to update the affinity_hint mask
that is consumed by the userspace to distribute the interrupts. However,
under the hood irq_set_affinity_hint() also applies the provided cpumask
(if not NULL) as the affinity for the given interrupt which is an
undocumented side effect.
To remove this side effect irq_set_affinity_hint() has been marked
as deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface irq_update_affinity_hint()
that only updates the affinity_hint pointer.
Signed-off-by: Nitesh Narayan Lal <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
index c3373fb1cd7f..a1b9434f4e25 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
@@ -276,8 +276,8 @@ static int set_comp_irq_affinity_hint(struct mlx5_core_dev *mdev, int i)
cpumask_set_cpu(cpumask_local_spread(i, mdev->priv.numa_node),
irq->mask);
if (IS_ENABLED(CONFIG_SMP) &&
- irq_set_affinity_hint(irqn, irq->mask))
- mlx5_core_warn(mdev, "irq_set_affinity_hint failed, irq 0x%.4x",
+ irq_update_affinity_hint(irqn, irq->mask))
+ mlx5_core_warn(mdev, "irq_update_affinity_hint failed, irq 0x%.4x",
irqn);
return 0;
@@ -291,7 +291,7 @@ static void clear_comp_irq_affinity_hint(struct mlx5_core_dev *mdev, int i)
irq = mlx5_irq_get(mdev, vecidx);
irqn = pci_irq_vector(mdev->pdev, vecidx);
- irq_set_affinity_hint(irqn, NULL);
+ irq_update_affinity_hint(irqn, NULL);
free_cpumask_var(irq->mask);
}
--
2.27.0
The driver uses irq_set_affinity_hint() to update the affinity_hint mask
that is consumed by the userspace to distribute the interrupts. However,
under the hood irq_set_affinity_hint() also applies the provided cpumask
(if not NULL) as the affinity for the given interrupt which is an
undocumented side effect.
To remove this side effect irq_set_affinity_hint() has been marked
as deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface irq_update_affinity_hint()
that only updates the affinity_hint pointer.
Signed-off-by: Nitesh Narayan Lal <[email protected]>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 2ac5b82676f3..be8e4af6c283 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -3243,8 +3243,8 @@ static int ixgbe_request_msix_irqs(struct ixgbe_adapter *adapter)
/* If Flow Director is enabled, set interrupt affinity */
if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) {
/* assign the mask for this irq */
- irq_set_affinity_hint(entry->vector,
- &q_vector->affinity_mask);
+ irq_update_affinity_hint(entry->vector,
+ &q_vector->affinity_mask);
}
}
@@ -3260,8 +3260,8 @@ static int ixgbe_request_msix_irqs(struct ixgbe_adapter *adapter)
free_queue_irqs:
while (vector) {
vector--;
- irq_set_affinity_hint(adapter->msix_entries[vector].vector,
- NULL);
+ irq_update_affinity_hint(adapter->msix_entries[vector].vector,
+ NULL);
free_irq(adapter->msix_entries[vector].vector,
adapter->q_vector[vector]);
}
@@ -3394,7 +3394,7 @@ static void ixgbe_free_irq(struct ixgbe_adapter *adapter)
continue;
/* clear the affinity_mask in the IRQ descriptor */
- irq_set_affinity_hint(entry->vector, NULL);
+ irq_update_affinity_hint(entry->vector, NULL);
free_irq(entry->vector, q_vector);
}
--
2.27.0
The driver uses irq_set_affinity_hint() to update the affinity_hint mask
that is consumed by the userspace to distribute the interrupts. However,
under the hood irq_set_affinity_hint() also applies the provided cpumask
(if not NULL) as the affinity for the given interrupt which is an
undocumented side effect.
To remove this side effect irq_set_affinity_hint() has been marked
as deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface irq_update_affinity_hint()
that only updates the affinity_hint pointer.
Signed-off-by: Nitesh Narayan Lal <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
---
drivers/net/ethernet/mellanox/mlx4/eq.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/eq.c b/drivers/net/ethernet/mellanox/mlx4/eq.c
index 9e48509ed3b2..414e390e6b48 100644
--- a/drivers/net/ethernet/mellanox/mlx4/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/eq.c
@@ -244,9 +244,9 @@ static void mlx4_set_eq_affinity_hint(struct mlx4_priv *priv, int vec)
cpumask_empty(eq->affinity_mask))
return;
- hint_err = irq_set_affinity_hint(eq->irq, eq->affinity_mask);
+ hint_err = irq_update_affinity_hint(eq->irq, eq->affinity_mask);
if (hint_err)
- mlx4_warn(dev, "irq_set_affinity_hint failed, err %d\n", hint_err);
+ mlx4_warn(dev, "irq_update_affinity_hint failed, err %d\n", hint_err);
}
#endif
@@ -1123,9 +1123,7 @@ static void mlx4_free_irqs(struct mlx4_dev *dev)
for (i = 0; i < dev->caps.num_comp_vectors + 1; ++i)
if (eq_table->eq[i].have_irq) {
free_cpumask_var(eq_table->eq[i].affinity_mask);
-#if defined(CONFIG_SMP)
- irq_set_affinity_hint(eq_table->eq[i].irq, NULL);
-#endif
+ irq_update_affinity_hint(eq_table->eq[i].irq, NULL);
free_irq(eq_table->eq[i].irq, eq_table->eq + i);
eq_table->eq[i].have_irq = 0;
}
--
2.27.0
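
The mlx4 hunk above can drop the #if defined(CONFIG_SMP) guard because the
new interfaces are also defined when CONFIG_SMP is not set (see the v1
changelog in the cover letter). The sketch below only illustrates that stub
pattern in userspace. The model_ names, the MODEL_CONFIG_SMP macro and the
-EINVAL return value of the stub are assumptions made for illustration, not
the definitions from include/linux/interrupt.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for illustration; NOT the kernel's types. */
typedef unsigned long model_cpumask_t;

struct model_irq {
	const model_cpumask_t *hint;	/* models the affinity_hint pointer */
};

#ifdef MODEL_CONFIG_SMP
/* "SMP" variant: records the hint pointer. */
static inline int model_irq_update_affinity_hint(struct model_irq *irq,
						 const model_cpumask_t *m)
{
	irq->hint = m;
	return 0;
}
#else
/* Stub variant: same signature, so callers build without an #ifdef. */
static inline int model_irq_update_affinity_hint(struct model_irq *irq,
						 const model_cpumask_t *m)
{
	(void)irq;
	(void)m;
	return -EINVAL;	/* assumed error value for this model */
}
#endif

/* Caller in the style of mlx4_free_irqs(): no #if guard is needed. */
static inline int model_teardown(struct model_irq *irq)
{
	return model_irq_update_affinity_hint(irq, NULL);
}
```

Because both variants share one prototype, the caller stays identical in
either configuration and only the return value differs.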
The driver uses irq_set_affinity_hint() to update the affinity_hint mask
that is consumed by the userspace to distribute the interrupts. However,
under the hood irq_set_affinity_hint() also applies the provided cpumask
(if not NULL) as the affinity for the given interrupt which is an
undocumented side effect.
To remove this side effect irq_set_affinity_hint() has been marked
as deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface irq_update_affinity_hint()
that only updates the affinity_hint pointer.
Signed-off-by: Nitesh Narayan Lal <[email protected]>
---
drivers/net/ethernet/emulex/benet/be_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 7968568bbe21..72fef9d882e2 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -3491,7 +3491,7 @@ static int be_msix_register(struct be_adapter *adapter)
if (status)
goto err_msix;
- irq_set_affinity_hint(vec, eqo->affinity_mask);
+ irq_update_affinity_hint(vec, eqo->affinity_mask);
}
return 0;
@@ -3552,7 +3552,7 @@ static void be_irq_unregister(struct be_adapter *adapter)
/* MSIx */
for_all_evt_queues(adapter, eqo, i) {
vec = be_msix_vec_get(adapter, eqo);
- irq_set_affinity_hint(vec, NULL);
+ irq_update_affinity_hint(vec, NULL);
free_irq(vec, eqo);
}
--
2.27.0
The driver uses irq_set_affinity_hint() to update the affinity_hint mask
that is consumed by the userspace to distribute the interrupts. However,
under the hood irq_set_affinity_hint() also applies the provided cpumask
(if not NULL) as the affinity for the given interrupt which is an
undocumented side effect.
To remove this side effect irq_set_affinity_hint() has been marked
as deprecated and new interfaces have been introduced. Hence, replace the
irq_set_affinity_hint() with the new interface irq_update_affinity_hint()
that only updates the affinity_hint pointer.
Signed-off-by: Nitesh Narayan Lal <[email protected]>
---
drivers/infiniband/hw/i40iw/i40iw_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/i40iw/i40iw_main.c b/drivers/infiniband/hw/i40iw/i40iw_main.c
index b496f30ce066..433d91c30cae 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_main.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_main.c
@@ -266,7 +266,7 @@ static void i40iw_disable_irq(struct i40iw_sc_dev *dev,
i40iw_wr32(dev->hw, I40E_PFINT_DYN_CTLN(msix_vec->idx - 1), 0);
else
i40iw_wr32(dev->hw, I40E_VFINT_DYN_CTLN1(msix_vec->idx - 1), 0);
- irq_set_affinity_hint(msix_vec->irq, NULL);
+ irq_update_affinity_hint(msix_vec->irq, NULL);
free_irq(msix_vec->irq, dev_id);
}
@@ -696,7 +696,7 @@ static enum i40iw_status_code i40iw_configure_ceq_vector(struct i40iw_device *iw
cpumask_clear(&msix_vec->mask);
cpumask_set_cpu(msix_vec->cpu_affinity, &msix_vec->mask);
- irq_set_affinity_hint(msix_vec->irq, &msix_vec->mask);
+ irq_update_affinity_hint(msix_vec->irq, &msix_vec->mask);
if (status) {
i40iw_pr_err("ceq irq config fail\n");
--
2.27.0
The driver uses irq_set_affinity_hint() to set the affinity for the lpfc
interrupts to a mask corresponding to the local NUMA node to avoid
performance overhead on AMD architectures.
However, the fact that irq_set_affinity_hint() also sets the affinity under
the hood is an undocumented side effect.
To remove this side effect irq_set_affinity_hint() has been marked as
deprecated and new interfaces have been introduced.
Also, as per the commit dcaa21367938 ("scsi: lpfc: Change default IRQ model
on AMD architectures"):
"On AMD architecture, revert the irq allocation to the normal style
(non-managed) and then use irq_set_affinity_hint() to set the cpu affinity
and disable user-space rebalancing."
we don't really need to set the affinity_hint as user-space rebalancing for
the lpfc interrupts is not desired.
Hence, replace the irq_set_affinity_hint() with irq_set_affinity() which
only applies the affinity for the interrupts.
Signed-off-by: Nitesh Narayan Lal <[email protected]>
---
drivers/scsi/lpfc/lpfc_init.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 5f018d02bf56..d6e48414018d 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -11360,7 +11360,7 @@ lpfc_irq_set_aff(struct lpfc_hba_eq_hdl *eqhdl, unsigned int cpu)
cpumask_clear(&eqhdl->aff_mask);
cpumask_set_cpu(cpu, &eqhdl->aff_mask);
irq_set_status_flags(eqhdl->irq, IRQ_NO_BALANCING);
- irq_set_affinity_hint(eqhdl->irq, &eqhdl->aff_mask);
+ irq_set_affinity(eqhdl->irq, &eqhdl->aff_mask);
}
/**
@@ -11649,7 +11649,6 @@ lpfc_sli4_enable_msix(struct lpfc_hba *phba)
for (--index; index >= 0; index--) {
eqhdl = lpfc_get_eq_hdl(index);
lpfc_irq_clear_aff(eqhdl);
- irq_set_affinity_hint(eqhdl->irq, NULL);
free_irq(eqhdl->irq, eqhdl);
}
@@ -11810,7 +11809,6 @@ lpfc_sli4_disable_intr(struct lpfc_hba *phba)
for (index = 0; index < phba->cfg_irq_chann; index++) {
eqhdl = lpfc_get_eq_hdl(index);
lpfc_irq_clear_aff(eqhdl);
- irq_set_affinity_hint(eqhdl->irq, NULL);
free_irq(eqhdl->irq, eqhdl);
}
} else {
--
2.27.0
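
The lpfc patch above also removes the irq_set_affinity_hint(eqhdl->irq, NULL)
calls on teardown. The reasoning appears to be that a driver which only calls
irq_set_affinity() never installs a hint pointer, so there is nothing to
clear. A toy userspace model of that reasoning follows. Everything prefixed
with model_, the single-word mask type and the -1 error value are assumptions
for illustration; this is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long model_cpumask_t;	/* one bit per CPU, toy type */

struct model_irq {
	model_cpumask_t affinity;	/* effective affinity */
	const model_cpumask_t *hint;	/* affinity_hint seen by userspace */
	int no_balancing;		/* stands in for IRQ_NO_BALANCING */
};

/* Models irq_set_affinity(): applies the mask, never touches the hint. */
static int model_irq_set_affinity(struct model_irq *irq,
				  const model_cpumask_t *mask)
{
	if (!mask)
		return -1;	/* assumed error value for the model */
	irq->affinity = *mask;
	return 0;
}

/* Mirrors the shape of lpfc_irq_set_aff() after the patch. */
static void model_lpfc_irq_set_aff(struct model_irq *irq,
				   model_cpumask_t *aff_mask,
				   unsigned int cpu)
{
	*aff_mask = 0;			/* cpumask_clear() */
	*aff_mask |= 1UL << cpu;	/* cpumask_set_cpu() */
	irq->no_balancing = 1;		/* irq_set_status_flags() */
	model_irq_set_affinity(irq, aff_mask);
}
```

In this model the hint pointer stays NULL for the whole lifetime of the
interrupt, which matches the commit message's point that the affinity_hint
is not needed when user-space rebalancing is disabled.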
On Tue, Jun 29, 2021 at 11:28 AM Nitesh Narayan Lal <[email protected]> wrote:
>
> The drivers currently rely on irq_set_affinity_hint() to either set the
> affinity_hint that is consumed by the userspace and/or to enforce a custom
> affinity.
>
> irq_set_affinity_hint() as the name suggests is originally introduced to
> only set the affinity_hint to help the userspace in guiding the interrupts
> and not the affinity itself. However, since the commit
>
> e2e64a932556 "genirq: Set initial affinity in irq_set_affinity_hint()"
>
> irq_set_affinity_hint() also started applying the provided cpumask (if not
> NULL) as the affinity for the interrupts. The issue that this commit was
> trying to solve is to allow the drivers to enforce their affinity mask to
> distribute the interrupts across the CPUs such that they don't always end
> up on CPU0. This issue has been resolved within the irq subsystem since the
> commit
>
> a0c9259dc4e1 "irq/matrix: Spread interrupts on allocation"
>
> Hence, there is no need for the drivers to overwrite the affinity to spread
> as it is dynamically performed at the time of allocation.
>
> Also, irq_set_affinity_hint() setting affinity unconditionally introduces
> issues for the drivers that only want to set their affinity_hint and not the
> affinity itself as for these driver interrupts the default_smp_affinity_mask
> is completely ignored (for detailed investigation please refer to [1]).
>
> Unfortunately reverting the commit e2e64a932556 is not an option at this
> point for two reasons [2]:
>
> - Several drivers for a valid reason (performance) rely on this API to
> enforce their affinity mask
>
> - Until very recently this was the only exported interface that was
> available
>
> To clear this out Thomas has come up with the following interfaces:
>
> - irq_set_affinity(): only sets affinity of an IRQ [3]
> - irq_update_affinity_hint(): Only sets the hint [4]
> - irq_set_affinity_and_hint(): Sets both affinity and the hint mask [4]
>
> The first API is already merged in the linux-next tree and the patch
> that introduces the other two interfaces is included with this patch-set.
>
> To move to the stage where we can safely get rid of the
> irq_set_affinity_hint(), which has been marked deprecated, we have to
> move all its consumers to these new interfaces. In this patch-set, I have
> done that for a few drivers and will hopefully move the rest of them in
> the coming days.
>
> Testing
> -------
> In terms of testing, I have performed some basic testing on x86 to verify
> things such as the interrupts are evenly spread on all CPUs, hint mask is
> correctly set etc. for the drivers - i40e, iavf, mlx5, mlx4, ixgbe, i40iw
> and enic on top of:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
>
> So more testing is probably required for these and the drivers that I didn't
> test and any help will be much appreciated.
>
>
> Notes
> -----
> - I was told that i40iw driver is going to be replaced by irdma, however,
> the new driver didn't land in Linus's tree yet. Once it does I will send
> a follow up patch for that as well.
>
> - For the mpt3sas driver I decided to go with the usage of
> irq_set_affinity_and_hint over irq_set_affinity based on my little
> analysis of it and the megaraid driver. However, if we are sure that it
> is not required then I can replace it with just irq_set_affinity as one
> of its comments suggests.
>
>
> Change from v1 [5]
> ------------------
> - Fixed compilation error by adding the new interface definitions for cases
> where CONFIG_SMP is not defined
>
> - Fixed function usage in megaraid_sas and removed unnecessary variable
> (Robin Murphy)
>
> - Removed unwanted #if/endif from mlx4 (Leon Romanovsky)
>
> - Other indentation related fixes
>
>
> [1] https://lore.kernel.org/lkml/[email protected]/
> [2] https://lore.kernel.org/linux-pci/[email protected]/
> [3] https://lore.kernel.org/linux-arm-kernel/[email protected]/
> [4] https://lore.kernel.org/patchwork/patch/1434326/
> [5] https://lore.kernel.org/linux-scsi/[email protected]/
>
>
> Nitesh Narayan Lal (13):
> iavf: Use irq_update_affinity_hint
> i40e: Use irq_update_affinity_hint
> scsi: megaraid_sas: Use irq_set_affinity_and_hint
> scsi: mpt3sas: Use irq_set_affinity_and_hint
> RDMA/i40iw: Use irq_update_affinity_hint
> enic: Use irq_update_affinity_hint
> be2net: Use irq_update_affinity_hint
> ixgbe: Use irq_update_affinity_hint
> mailbox: Use irq_update_affinity_hint
> scsi: lpfc: Use irq_set_affinity
> hinic: Use irq_set_affinity_and_hint
> net/mlx5: Use irq_update_affinity_hint
> net/mlx4: Use irq_update_affinity_hint
>
> Thomas Gleixner (1):
> genirq: Provide new interfaces for affinity hints
>
> drivers/infiniband/hw/i40iw/i40iw_main.c | 4 +-
> drivers/mailbox/bcm-flexrm-mailbox.c | 4 +-
> drivers/net/ethernet/cisco/enic/enic_main.c | 8 +--
> drivers/net/ethernet/emulex/benet/be_main.c | 4 +-
> drivers/net/ethernet/huawei/hinic/hinic_rx.c | 4 +-
> drivers/net/ethernet/intel/i40e/i40e_main.c | 8 +--
> drivers/net/ethernet/intel/iavf/iavf_main.c | 8 +--
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 10 ++--
> drivers/net/ethernet/mellanox/mlx4/eq.c | 8 ++-
> .../net/ethernet/mellanox/mlx5/core/pci_irq.c | 6 +--
> drivers/scsi/lpfc/lpfc_init.c | 4 +-
> drivers/scsi/megaraid/megaraid_sas_base.c | 27 +++++-----
> drivers/scsi/mpt3sas/mpt3sas_base.c | 21 ++++----
> include/linux/interrupt.h | 53 ++++++++++++++++++-
> kernel/irq/manage.c | 8 +--
> 15 files changed, 113 insertions(+), 64 deletions(-)
>
> --
>
>
Gentle ping.
Any comments or suggestions on any of the patches included in this series?
--
Thanks
Nitesh
On Thu, Jul 08, 2021 at 03:24:20PM -0400, Nitesh Lal wrote:
> On Tue, Jun 29, 2021 at 11:28 AM Nitesh Narayan Lal <[email protected]> wrote:
<...>
>
> Gentle ping.
> Any comments or suggestions on any of the patches included in this series?
Please wait for -rc1, rebase and resend.
At least i40iw was deleted during merge window.
Thanks
>
> --
> Thanks
> Nitesh
>
On Sun, Jul 11, 2021 at 7:32 AM Leon Romanovsky <[email protected]> wrote:
>
> On Thu, Jul 08, 2021 at 03:24:20PM -0400, Nitesh Lal wrote:
> > On Tue, Jun 29, 2021 at 11:28 AM Nitesh Narayan Lal <[email protected]> wrote:
>
> <...>
>
> >
> > Gentle ping.
> > Any comments or suggestions on any of the patches included in this series?
>
> Please wait for -rc1, rebase and resend.
> At least i40iw was deleted during merge window.
>
Right, will rebase on top of 5.14-rc1 and resend.
--
Thanks
Nitesh
Hi Leon,
On Sun, Jul 11, 2021 at 7:32 AM Leon Romanovsky <[email protected]> wrote:
>
> On Thu, Jul 08, 2021 at 03:24:20PM -0400, Nitesh Lal wrote:
> > On Tue, Jun 29, 2021 at 11:28 AM Nitesh Narayan Lal <[email protected]> wrote:
>
> <...>
>
> >
> > Gentle ping.
> > Any comments or suggestions on any of the patches included in this series?
>
> Please wait for -rc1, rebase and resend.
> At least i40iw was deleted during merge window.
>
In -rc1 some non-trivial mlx5 changes also went in. I was going through
these changes and it seems after your patch
e4e3f24b822f: ("net/mlx5: Provide cpumask at EQ creation phase")
we do want to control the affinity for the mlx5 interrupts from the driver.
Is that correct? This would mean that we should use
irq_set_affinity_and_hint() instead of irq_update_affinity_hint().
--
Thanks
Nitesh
On Mon, Jul 12, 2021 at 05:27:05PM -0400, Nitesh Lal wrote:
> Hi Leon,
>
> On Sun, Jul 11, 2021 at 7:32 AM Leon Romanovsky <[email protected]> wrote:
> >
> > On Thu, Jul 08, 2021 at 03:24:20PM -0400, Nitesh Lal wrote:
> > > On Tue, Jun 29, 2021 at 11:28 AM Nitesh Narayan Lal <[email protected]> wrote:
> >
> > <...>
> >
> > >
> > > Gentle ping.
> > > Any comments or suggestions on any of the patches included in this series?
> >
> > Please wait for -rc1, rebase and resend.
> > At least i40iw was deleted during merge window.
> >
>
> In -rc1 some non-trivial mlx5 changes also went in. I was going through
> these changes and it seems after your patch
>
> e4e3f24b822f: ("net/mlx5: Provide cpumask at EQ creation phase")
>
> we do want to control the affinity for the mlx5 interrupts from the driver.
> Is that correct?
We would like to create devices with the correct affinity from the
beginning. For this, we will introduce an extension to devlink to control
the affinity, which will be used prior to the initialization sequence.
Currently, netdev users who don't want irqbalance dig into procfs,
reconfigure the affinity on already existing devices and hope for the best.
This is even more cumbersome for the SIOV use case, where every physical
NIC PCI device will/can create thousands of lightweight netdevs that will
be forwarded to the containers later. These containers are limited to known
CPU cores, so there is no reason not to limit the netdev devices too.
The same goes for other sub-functions of that PCI device, like RDMA,
vdpa, etc.
> This would mean that we should use irq_set_affinity_and_hint() instead
> of irq_update_affinity_hint().
I think so.
Thanks
>
> --
> Thanks
> Nitesh
>
On Tue, Jul 13, 2021 at 1:01 AM Leon Romanovsky <[email protected]> wrote:
>
> On Mon, Jul 12, 2021 at 05:27:05PM -0400, Nitesh Lal wrote:
> > Hi Leon,
> >
<snip>
> > > >
> > > > Gentle ping.
> > > > Any comments or suggestions on any of the patches included in this series?
> > >
> > > Please wait for -rc1, rebase and resend.
> > > At least i40iw was deleted during merge window.
> > >
> >
> > In -rc1 some non-trivial mlx5 changes also went in. I was going through
> > these changes and it seems after your patch
> >
> > e4e3f24b822f: ("net/mlx5: Provide cpumask at EQ creation phase")
> >
> > we do want to control the affinity for the mlx5 interrupts from the driver.
> > Is that correct?
>
> We would like to create devices with the correct affinity from the
> beginning. For this, we will introduce an extension to devlink to control
> the affinity, which will be used prior to the initialization sequence.
>
> Currently, netdev users who don't want irqbalance dig into procfs,
> reconfigure the affinity on already existing devices and hope for the best.
>
> This is even more cumbersome for the SIOV use case, where every physical
> NIC PCI device will/can create thousands of lightweight netdevs that will
> be forwarded to the containers later. These containers are limited to known
> CPU cores, so there is no reason not to limit the netdev devices too.
>
> The same goes for other sub-functions of that PCI device, like RDMA,
> vdpa, etc.
>
> > This would mean that we should use irq_set_affinity_and_hint() instead
> > of irq_update_affinity_hint().
>
> I think so.
>
Thanks, will make that change in the patch and re-send.
I will also drop your reviewed-by for the mlx5 patch so that you can
have a look at it again, please let me know if you have any
objections.
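
For reference, the difference between the two calls discussed above can be
modelled roughly as follows. This is a userspace sketch and not kernel code:
the model_ names, the single-word mask type and the shared helper are
assumptions chosen to illustrate the described behaviour (the hint is always
recorded, the affinity is applied only by the "and_hint" variant and only
for a non-NULL mask).

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long model_cpumask_t;	/* one bit per CPU, toy type */

struct model_irq {
	model_cpumask_t affinity;	/* effective affinity */
	const model_cpumask_t *hint;	/* affinity_hint seen by userspace */
};

/* Shared helper: always record the hint, optionally apply the mask. */
static int model_apply_affinity_hint(struct model_irq *irq,
				     const model_cpumask_t *mask,
				     int set_affinity)
{
	irq->hint = mask;
	if (mask && set_affinity)
		irq->affinity = *mask;
	return 0;
}

/* Models irq_update_affinity_hint(): hint only. */
static int model_irq_update_affinity_hint(struct model_irq *irq,
					  const model_cpumask_t *mask)
{
	return model_apply_affinity_hint(irq, mask, 0);
}

/* Models irq_set_affinity_and_hint(): hint plus affinity. */
static int model_irq_set_affinity_and_hint(struct model_irq *irq,
					   const model_cpumask_t *mask)
{
	return model_apply_affinity_hint(irq, mask, 1);
}
```

With the hint-only call the effective affinity is left to the irq core and
to userspace, whereas the combined call lets the driver enforce its own
mask, which is the behaviour wanted for mlx5 here.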
--
Thanks
Nitesh