2022-07-28 19:55:54

by Tariq Toukan

Subject: [PATCH net-next V4 0/3] Introduce and use NUMA distance metrics

Hi,

Implement and expose a CPU spread API based on the scheduler's
sched_numa_find_closest(). Use it in the mlx5 and enic device drivers.
This replaces the binary NUMA preference (local / remote) with one that
takes the actual distances into account, so that remote NUMA nodes at a
short distance are preferred over farther ones.
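
At its core, the spread logic repeatedly asks the scheduler for the
closest remaining online CPU to the device's home node. A minimal
sketch of the idea (the helper name here is illustrative; the real
code in kernel/sched/topology.c also handles the fallback paths:
invalid node, allocation failure, !CONFIG_SMP):

static bool sched_cpus_spread_by_distance(int node, u16 *cpus, int ncpus)
{
        cpumask_var_t cpumask;
        int first, i;

        if (!zalloc_cpumask_var(&cpumask, GFP_KERNEL))
                return false;

        cpumask_copy(cpumask, cpu_online_mask);

        /* Walk outwards from the first CPU of the home node. */
        first = cpumask_first(cpumask_of_node(node));

        for (i = 0; i < ncpus; i++) {
                int cpu = sched_numa_find_closest(cpumask, first);

                if (cpu >= nr_cpu_ids) {
                        free_cpumask_var(cpumask);
                        return false;
                }
                cpus[i] = cpu;
                /* The mask is private here, non-atomic clear is enough. */
                __cpumask_clear_cpu(cpu, cpumask);
        }

        free_cpumask_var(cpumask);
        return true;
}

When the distance-based walk fails, the array is filled by a simpler
fallback, and with !CONFIG_SMP it is zeroed (see the v4 note below).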

This has significant performance implications when NUMA-aware memory
allocations are in use, improving both throughput and CPU utilization.

Regards,
Tariq

v4:
- memset to zero the cpus array in case !CONFIG_SMP.

v3:
- Introduce the logic as a common API instead of being mlx5 specific.
- Add implementation to enic device driver.
- Use the non-atomic __cpumask_clear_cpu().

v2:
- Replace EXPORT_SYMBOL with EXPORT_SYMBOL_GPL, per Peter's comment.
- Separate the set_cpu operation into two functions, per Saeed's suggestion.
- Add Saeed's Acked-by signature.


Tariq Toukan (3):
sched/topology: Add NUMA-based CPUs spread API
net/mlx5e: Improve remote NUMA preferences used for the IRQ affinity
hints
enic: Use NUMA distances logic when setting affinity hints

 drivers/net/ethernet/cisco/enic/enic_main.c  | 10 +++-
 drivers/net/ethernet/mellanox/mlx5/core/eq.c |  5 +-
 include/linux/sched/topology.h               |  5 ++
 kernel/sched/topology.c                      | 49 ++++++++++++++++++++
 4 files changed, 65 insertions(+), 4 deletions(-)

--
2.21.0


2022-07-28 19:57:17

by Tariq Toukan

Subject: [PATCH net-next V4 3/3] enic: Use NUMA distances logic when setting affinity hints

Use the new CPU spread API to sort the preferred CPUs of remote NUMA
nodes according to their distance.
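
For reference, the call below relies on the API introduced in patch
1/3, declared in include/linux/sched/topology.h; roughly:

/* Fill cpus[0..ncpus-1] with CPU ids, sorted by increasing NUMA
 * distance from @node (prototype as assumed from this series).
 */
void sched_cpus_set_spread(int node, u16 *cpus, int ncpus);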

Cc: Christian Benvenuti <[email protected]>
Cc: Govindarajulu Varadarajan <[email protected]>
Reviewed-by: Gal Pressman <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
---
drivers/net/ethernet/cisco/enic/enic_main.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index 372fb7b3a282..9de3c3ffa1e3 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -44,6 +44,7 @@
 #include <linux/cpu_rmap.h>
 #endif
 #include <linux/crash_dump.h>
+#include <linux/sched/topology.h>
 #include <net/busy_poll.h>
 #include <net/vxlan.h>
 
@@ -114,8 +115,14 @@ static struct enic_intr_mod_range mod_range[ENIC_MAX_LINK_SPEEDS] = {
 static void enic_init_affinity_hint(struct enic *enic)
 {
         int numa_node = dev_to_node(&enic->pdev->dev);
+        u16 *cpus;
         int i;
 
+        cpus = kcalloc(enic->intr_count, sizeof(*cpus), GFP_KERNEL);
+        if (!cpus)
+                return;
+
+        sched_cpus_set_spread(numa_node, cpus, enic->intr_count);
         for (i = 0; i < enic->intr_count; i++) {
                 if (enic_is_err_intr(enic, i) || enic_is_notify_intr(enic, i) ||
                     (cpumask_available(enic->msix[i].affinity_mask) &&
@@ -123,9 +130,10 @@ static void enic_init_affinity_hint(struct enic *enic)
                         continue;
                 if (zalloc_cpumask_var(&enic->msix[i].affinity_mask,
                                        GFP_KERNEL))
-                        cpumask_set_cpu(cpumask_local_spread(i, numa_node),
+                        cpumask_set_cpu(cpus[i],
                                         enic->msix[i].affinity_mask);
         }
+        kfree(cpus);
 }
 
 static void enic_free_affinity_hint(struct enic *enic)
--
2.21.0