2024-04-03 16:01:55

by Dawei Li

[permalink] [raw]
Subject: [PATCH v3 00/10] perf: Avoid placing cpumask on the stack

Hi all,

This is v3 of previous series on elimination of placing cpumask var on
stack.

Change since v2:
- Update commit message for patch[2-10]. [Mark]
- Add Reviewed-by from Mark for patch[2-10].

Change since v1:
- Change from dynamic allocation to a temporary var free helper:
cpumask_any_and_but(). [Mark]

- Some minor coding style improvements, reverse chrismas tree e.g.

- For cpumask_any_and_but() itself:
- Moved to cpumask.h, just like other helpers.
- Return value converted to unsigned int.
- Remove EXPORT_SYMBOL, for obvious reason.

v1:
https://lore.kernel.org/lkml/[email protected]/
https://lore.kernel.org/lkml/[email protected]/

v2:
https://lore.kernel.org/lkml/[email protected]/

Dawei Li (9):
perf/alibaba_uncore_drw: Avoid placing cpumask on the stack
perf/arm-cmn: Avoid placing cpumask on the stack
perf/arm_cspmu: Avoid placing cpumask on the stack
perf/arm_dsu: Avoid placing cpumask on the stack
perf/dwc_pcie: Avoid placing cpumask on the stack
perf/hisi_pcie: Avoid placing cpumask on the stack
perf/hisi_uncore: Avoid placing cpumask on the stack
perf/qcom_l2: Avoid placing cpumask on the stack
perf/thunderx2: Avoid placing cpumask on the stack

Mark Rutland (1):
cpumask: add cpumask_any_and_but()

drivers/perf/alibaba_uncore_drw_pmu.c | 10 +++-------
drivers/perf/arm-cmn.c | 10 +++++-----
drivers/perf/arm_cspmu/arm_cspmu.c | 8 +++-----
drivers/perf/arm_dsu_pmu.c | 19 ++++++-------------
drivers/perf/dwc_pcie_pmu.c | 10 ++++------
drivers/perf/hisilicon/hisi_pcie_pmu.c | 9 ++++-----
drivers/perf/hisilicon/hisi_uncore_pmu.c | 6 ++----
drivers/perf/qcom_l2_pmu.c | 8 +++-----
drivers/perf/thunderx2_pmu.c | 10 +++-------
include/linux/cpumask.h | 23 +++++++++++++++++++++++
10 files changed, 56 insertions(+), 57 deletions(-)

Thanks,

Dawei

--
2.27.0



2024-04-03 16:02:45

by Dawei Li

[permalink] [raw]
Subject: [PATCH v3 03/10] perf/arm-cmn: Avoid placing cpumask on the stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_any_and_but() to avoid the need for a temporary cpumask on
the stack.

Suggested-by: Mark Rutland <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Dawei Li <[email protected]>
---
drivers/perf/arm-cmn.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
index 7ef9c7e4836b..6bfb0c4a1287 100644
--- a/drivers/perf/arm-cmn.c
+++ b/drivers/perf/arm-cmn.c
@@ -1950,20 +1950,20 @@ static int arm_cmn_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_no
struct arm_cmn *cmn;
unsigned int target;
int node;
- cpumask_t mask;

cmn = hlist_entry_safe(cpuhp_node, struct arm_cmn, cpuhp_node);
if (cpu != cmn->cpu)
return 0;

node = dev_to_node(cmn->dev);
- if (cpumask_and(&mask, cpumask_of_node(node), cpu_online_mask) &&
- cpumask_andnot(&mask, &mask, cpumask_of(cpu)))
- target = cpumask_any(&mask);
- else
+
+ target = cpumask_any_and_but(cpumask_of_node(node), cpu_online_mask, cpu);
+ if (target >= nr_cpu_ids)
target = cpumask_any_but(cpu_online_mask, cpu);
+
if (target < nr_cpu_ids)
arm_cmn_migrate(cmn, target);
+
return 0;
}

--
2.27.0


2024-04-03 16:02:54

by Dawei Li

[permalink] [raw]
Subject: [PATCH v3 02/10] perf/alibaba_uncore_drw: Avoid placing cpumask on the stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_any_and_but() to avoid the need for a temporary cpumask on
the stack.

Suggested-by: Mark Rutland <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Dawei Li <[email protected]>
---
drivers/perf/alibaba_uncore_drw_pmu.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_uncore_drw_pmu.c
index a9277dcf90ce..d4d14b65c4a5 100644
--- a/drivers/perf/alibaba_uncore_drw_pmu.c
+++ b/drivers/perf/alibaba_uncore_drw_pmu.c
@@ -746,18 +746,14 @@ static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
struct ali_drw_pmu_irq *irq;
struct ali_drw_pmu *drw_pmu;
unsigned int target;
- int ret;
- cpumask_t node_online_cpus;

irq = hlist_entry_safe(node, struct ali_drw_pmu_irq, node);
if (cpu != irq->cpu)
return 0;

- ret = cpumask_and(&node_online_cpus,
- cpumask_of_node(cpu_to_node(cpu)), cpu_online_mask);
- if (ret)
- target = cpumask_any_but(&node_online_cpus, cpu);
- else
+ target = cpumask_any_and_but(cpumask_of_node(cpu_to_node(cpu)),
+ cpu_online_mask, cpu);
+ if (target >= nr_cpu_ids)
target = cpumask_any_but(cpu_online_mask, cpu);

if (target >= nr_cpu_ids)
--
2.27.0


2024-04-03 16:04:07

by Dawei Li

[permalink] [raw]
Subject: [PATCH v3 04/10] perf/arm_cspmu: Avoid placing cpumask on the stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_any_and_but() to avoid the need for a temporary cpumask on
the stack.

Suggested-by: Mark Rutland <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Dawei Li <[email protected]>
---
drivers/perf/arm_cspmu/arm_cspmu.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index b9a252272f1e..fd1004251665 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -1322,8 +1322,7 @@ static int arm_cspmu_cpu_online(unsigned int cpu, struct hlist_node *node)

static int arm_cspmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
{
- int dst;
- struct cpumask online_supported;
+ unsigned int dst;

struct arm_cspmu *cspmu =
hlist_entry_safe(node, struct arm_cspmu, cpuhp_node);
@@ -1333,9 +1332,8 @@ static int arm_cspmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
return 0;

/* Choose a new CPU to migrate ownership of the PMU to */
- cpumask_and(&online_supported, &cspmu->associated_cpus,
- cpu_online_mask);
- dst = cpumask_any_but(&online_supported, cpu);
+ dst = cpumask_any_and_but(&cspmu->associated_cpus,
+ cpu_online_mask, cpu);
if (dst >= nr_cpu_ids)
return 0;

--
2.27.0


2024-04-03 16:04:43

by Dawei Li

[permalink] [raw]
Subject: [PATCH v3 05/10] perf/arm_dsu: Avoid placing cpumask on the stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_any_and_but() to avoid the need for a temporary cpumask on
the stack.

Suggested-by: Mark Rutland <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Dawei Li <[email protected]>
---
drivers/perf/arm_dsu_pmu.c | 19 ++++++-------------
1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/drivers/perf/arm_dsu_pmu.c b/drivers/perf/arm_dsu_pmu.c
index bae3ca37f846..adc0bbb5fafe 100644
--- a/drivers/perf/arm_dsu_pmu.c
+++ b/drivers/perf/arm_dsu_pmu.c
@@ -230,15 +230,6 @@ static const struct attribute_group *dsu_pmu_attr_groups[] = {
NULL,
};

-static int dsu_pmu_get_online_cpu_any_but(struct dsu_pmu *dsu_pmu, int cpu)
-{
- struct cpumask online_supported;
-
- cpumask_and(&online_supported,
- &dsu_pmu->associated_cpus, cpu_online_mask);
- return cpumask_any_but(&online_supported, cpu);
-}
-
static inline bool dsu_pmu_counter_valid(struct dsu_pmu *dsu_pmu, u32 idx)
{
return (idx < dsu_pmu->num_counters) ||
@@ -827,14 +818,16 @@ static int dsu_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)

static int dsu_pmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
{
- int dst;
- struct dsu_pmu *dsu_pmu = hlist_entry_safe(node, struct dsu_pmu,
- cpuhp_node);
+ struct dsu_pmu *dsu_pmu;
+ unsigned int dst;
+
+ dsu_pmu = hlist_entry_safe(node, struct dsu_pmu, cpuhp_node);

if (!cpumask_test_and_clear_cpu(cpu, &dsu_pmu->active_cpu))
return 0;

- dst = dsu_pmu_get_online_cpu_any_but(dsu_pmu, cpu);
+ dst = cpumask_any_and_but(&dsu_pmu->associated_cpus,
+ cpu_online_mask, cpu);
/* If there are no active CPUs in the DSU, leave IRQ disabled */
if (dst >= nr_cpu_ids)
return 0;
--
2.27.0


2024-04-03 16:04:58

by Dawei Li

[permalink] [raw]
Subject: [PATCH v3 09/10] perf/qcom_l2: Avoid placing cpumask on the stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_any_and_but() to avoid the need for a temporary cpumask on
the stack.

Suggested-by: Mark Rutland <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Dawei Li <[email protected]>
---
drivers/perf/qcom_l2_pmu.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/perf/qcom_l2_pmu.c b/drivers/perf/qcom_l2_pmu.c
index 148df5ae8ef8..b5a44dc1dc3a 100644
--- a/drivers/perf/qcom_l2_pmu.c
+++ b/drivers/perf/qcom_l2_pmu.c
@@ -801,9 +801,8 @@ static int l2cache_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)

static int l2cache_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
{
- struct cluster_pmu *cluster;
struct l2cache_pmu *l2cache_pmu;
- cpumask_t cluster_online_cpus;
+ struct cluster_pmu *cluster;
unsigned int target;

l2cache_pmu = hlist_entry_safe(node, struct l2cache_pmu, node);
@@ -820,9 +819,8 @@ static int l2cache_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
cluster->on_cpu = -1;

/* Any other CPU for this cluster which is still online */
- cpumask_and(&cluster_online_cpus, &cluster->cluster_cpus,
- cpu_online_mask);
- target = cpumask_any_but(&cluster_online_cpus, cpu);
+ target = cpumask_any_and_but(&cluster->cluster_cpus,
+ cpu_online_mask, cpu);
if (target >= nr_cpu_ids) {
disable_irq(cluster->irq);
return 0;
--
2.27.0


2024-04-03 16:05:40

by Dawei Li

[permalink] [raw]
Subject: [PATCH v3 10/10] perf/thunderx2: Avoid placing cpumask on the stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_any_and_but() to avoid the need for a temporary cpumask on
the stack.

Suggested-by: Mark Rutland <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Dawei Li <[email protected]>
---
drivers/perf/thunderx2_pmu.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/perf/thunderx2_pmu.c b/drivers/perf/thunderx2_pmu.c
index e16d10c763de..b3377b662ffc 100644
--- a/drivers/perf/thunderx2_pmu.c
+++ b/drivers/perf/thunderx2_pmu.c
@@ -932,9 +932,8 @@ static int tx2_uncore_pmu_online_cpu(unsigned int cpu,
static int tx2_uncore_pmu_offline_cpu(unsigned int cpu,
struct hlist_node *hpnode)
{
- int new_cpu;
struct tx2_uncore_pmu *tx2_pmu;
- struct cpumask cpu_online_mask_temp;
+ unsigned int new_cpu;

tx2_pmu = hlist_entry_safe(hpnode,
struct tx2_uncore_pmu, hpnode);
@@ -945,11 +944,8 @@ static int tx2_uncore_pmu_offline_cpu(unsigned int cpu,
if (tx2_pmu->hrtimer_callback)
hrtimer_cancel(&tx2_pmu->hrtimer);

- cpumask_copy(&cpu_online_mask_temp, cpu_online_mask);
- cpumask_clear_cpu(cpu, &cpu_online_mask_temp);
- new_cpu = cpumask_any_and(
- cpumask_of_node(tx2_pmu->node),
- &cpu_online_mask_temp);
+ new_cpu = cpumask_any_and_but(cpumask_of_node(tx2_pmu->node),
+ cpu_online_mask, cpu);

tx2_pmu->cpu = new_cpu;
if (new_cpu >= nr_cpu_ids)
--
2.27.0


2024-04-03 16:08:02

by Dawei Li

[permalink] [raw]
Subject: [PATCH v3 01/10] cpumask: add cpumask_any_and_but()

From: Mark Rutland <[email protected]>

In some cases, it's useful to be able to select a random cpu from the
intersection of two masks, excluding a particular CPU.

For example, in some systems an uncore PMU is shared by a subset of
CPUs, and management of this PMU is assigned to some arbitrary CPU in
this set. Whenever the management CPU is hotplugged out, we wish to
migrate responsibility to another arbitrary CPU which is both in this
set and online.

Today we can use cpumask_any_and() to select an arbitrary CPU in the
intersection of two masks. We can also use cpumask_any_but() to select
any arbitrary cpu in a mask excluding, a particular CPU.

To do both, we either need to use a temporary cpumask, which is
wasteful, or use some lower-level cpumask helpers, which can be unclear.

This patch adds a new cpumask_any_and_but() to cater for these cases.

Signed-off-by: Mark Rutland <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: [email protected]
Signed-off-by: Dawei Li <[email protected]>
---
include/linux/cpumask.h | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 1c29947db848..121f3ac757ff 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -388,6 +388,29 @@ unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
return i;
}

+/**
+ * cpumask_any_and_but - pick a "random" cpu from *mask1 & *mask2, but not this one.
+ * @mask1: the first input cpumask
+ * @mask2: the second input cpumask
+ * @cpu: the cpu to ignore
+ *
+ * Returns >= nr_cpu_ids if no cpus set.
+ */
+static inline
+unsigned int cpumask_any_and_but(const struct cpumask *mask1,
+ const struct cpumask *mask2,
+ unsigned int cpu)
+{
+ unsigned int i;
+
+ cpumask_check(cpu);
+ i = cpumask_first_and(mask1, mask2);
+ if (i != cpu)
+ return i;
+
+ return cpumask_next_and(cpu, mask1, mask2);
+}
+
/**
* cpumask_nth - get the Nth cpu in a cpumask
* @srcp: the cpumask pointer
--
2.27.0


2024-04-03 16:09:04

by Dawei Li

[permalink] [raw]
Subject: [PATCH v3 07/10] perf/hisi_pcie: Avoid placing cpumask on the stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_any_and_but() to avoid the need for a temporary cpumask on
the stack.

Suggested-by: Mark Rutland <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Dawei Li <[email protected]>
---
drivers/perf/hisilicon/hisi_pcie_pmu.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
index 5d1f0e9fdb08..06b192cc31d5 100644
--- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
+++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
@@ -673,7 +673,6 @@ static int hisi_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
{
struct hisi_pcie_pmu *pcie_pmu = hlist_entry_safe(node, struct hisi_pcie_pmu, node);
unsigned int target;
- cpumask_t mask;
int numa_node;

/* Nothing to do if this CPU doesn't own the PMU */
@@ -684,10 +683,10 @@ static int hisi_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)

/* Choose a local CPU from all online cpus. */
numa_node = dev_to_node(&pcie_pmu->pdev->dev);
- if (cpumask_and(&mask, cpumask_of_node(numa_node), cpu_online_mask) &&
- cpumask_andnot(&mask, &mask, cpumask_of(cpu)))
- target = cpumask_any(&mask);
- else
+
+ target = cpumask_any_and_but(cpumask_of_node(numa_node),
+ cpu_online_mask, cpu);
+ if (target >= nr_cpu_ids)
target = cpumask_any_but(cpu_online_mask, cpu);

if (target >= nr_cpu_ids) {
--
2.27.0


2024-04-03 16:12:19

by Dawei Li

[permalink] [raw]
Subject: [PATCH v3 08/10] perf/hisi_uncore: Avoid placing cpumask on the stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_any_and_but() to avoid the need for a temporary cpumask on
the stack.

Suggested-by: Mark Rutland <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Dawei Li <[email protected]>
---
drivers/perf/hisilicon/hisi_uncore_pmu.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
index 04031450d5fe..ccc9191ad1b6 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
@@ -504,7 +504,6 @@ int hisi_uncore_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
{
struct hisi_pmu *hisi_pmu = hlist_entry_safe(node, struct hisi_pmu,
node);
- cpumask_t pmu_online_cpus;
unsigned int target;

if (!cpumask_test_and_clear_cpu(cpu, &hisi_pmu->associated_cpus))
@@ -518,9 +517,8 @@ int hisi_uncore_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
hisi_pmu->on_cpu = -1;

/* Choose a new CPU to migrate ownership of the PMU to */
- cpumask_and(&pmu_online_cpus, &hisi_pmu->associated_cpus,
- cpu_online_mask);
- target = cpumask_any_but(&pmu_online_cpus, cpu);
+ target = cpumask_any_and_but(&hisi_pmu->associated_cpus,
+ cpu_online_mask, cpu);
if (target >= nr_cpu_ids)
return 0;

--
2.27.0


2024-04-03 16:31:46

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH v3 01/10] cpumask: add cpumask_any_and_but()

On Wed, Apr 03, 2024 at 11:59:41PM +0800, Dawei Li wrote:
> From: Mark Rutland <[email protected]>
>
> In some cases, it's useful to be able to select a random cpu from the
> intersection of two masks, excluding a particular CPU.
>
> For example, in some systems an uncore PMU is shared by a subset of
> CPUs, and management of this PMU is assigned to some arbitrary CPU in
> this set. Whenever the management CPU is hotplugged out, we wish to
> migrate responsibility to another arbitrary CPU which is both in this
> set and online.
>
> Today we can use cpumask_any_and() to select an arbitrary CPU in the
> intersection of two masks. We can also use cpumask_any_but() to select
> any arbitrary cpu in a mask excluding, a particular CPU.
>
> To do both, we either need to use a temporary cpumask, which is
> wasteful, or use some lower-level cpumask helpers, which can be unclear.
>
> This patch adds a new cpumask_any_and_but() to cater for these cases.
>
> Signed-off-by: Mark Rutland <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Rusty Russell <[email protected]>
> Cc: [email protected]
> Signed-off-by: Dawei Li <[email protected]>

Thank you,

Acked-by: Yury Norov <[email protected]>

2024-04-03 16:37:37

by Dawei Li

[permalink] [raw]
Subject: [PATCH v3 06/10] perf/dwc_pcie: Avoid placing cpumask on the stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_any_and_but() to avoid the need for a temporary cpumask on
the stack.

Suggested-by: Mark Rutland <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Dawei Li <[email protected]>
---
drivers/perf/dwc_pcie_pmu.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/perf/dwc_pcie_pmu.c b/drivers/perf/dwc_pcie_pmu.c
index 957058ad0099..c5e328f23841 100644
--- a/drivers/perf/dwc_pcie_pmu.c
+++ b/drivers/perf/dwc_pcie_pmu.c
@@ -690,9 +690,8 @@ static int dwc_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_n
{
struct dwc_pcie_pmu *pcie_pmu;
struct pci_dev *pdev;
- int node;
- cpumask_t mask;
unsigned int target;
+ int node;

pcie_pmu = hlist_entry_safe(cpuhp_node, struct dwc_pcie_pmu, cpuhp_node);
/* Nothing to do if this CPU doesn't own the PMU */
@@ -702,10 +701,9 @@ static int dwc_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_n
pcie_pmu->on_cpu = -1;
pdev = pcie_pmu->pdev;
node = dev_to_node(&pdev->dev);
- if (cpumask_and(&mask, cpumask_of_node(node), cpu_online_mask) &&
- cpumask_andnot(&mask, &mask, cpumask_of(cpu)))
- target = cpumask_any(&mask);
- else
+
+ target = cpumask_any_and_but(cpumask_of_node(node), cpu_online_mask, cpu);
+ if (target >= nr_cpu_ids)
target = cpumask_any_but(cpu_online_mask, cpu);

if (target >= nr_cpu_ids) {
--
2.27.0


2024-04-08 00:28:47

by Shuai Xue

[permalink] [raw]
Subject: Re: [PATCH v3 02/10] perf/alibaba_uncore_drw: Avoid placing cpumask on the stack



On 2024/4/3 23:59, Dawei Li wrote:
> In general it's preferable to avoid placing cpumasks on the stack, as
> for large values of NR_CPUS these can consume significant amounts of
> stack space and make stack overflows more likely.
>
> Use cpumask_any_and_but() to avoid the need for a temporary cpumask on
> the stack.
>
> Suggested-by: Mark Rutland <[email protected]>
> Reviewed-by: Mark Rutland <[email protected]>
> Signed-off-by: Dawei Li <[email protected]>
> ---

Reviewed-by: Shuai Xue <[email protected]>

2024-04-08 00:30:52

by Shuai Xue

[permalink] [raw]
Subject: Re: [PATCH v3 06/10] perf/dwc_pcie: Avoid placing cpumask on the stack



On 2024/4/3 23:59, Dawei Li wrote:
> In general it's preferable to avoid placing cpumasks on the stack, as
> for large values of NR_CPUS these can consume significant amounts of
> stack space and make stack overflows more likely.
>
> Use cpumask_any_and_but() to avoid the need for a temporary cpumask on
> the stack.
>
> Suggested-by: Mark Rutland <[email protected]>
> Reviewed-by: Mark Rutland <[email protected]>
> Signed-off-by: Dawei Li <[email protected]>

Reviewed-by: Shuai Xue <[email protected]>

Thank you.

Shuai

2024-04-09 17:30:06

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v3 00/10] perf: Avoid placing cpumask on the stack

On Wed, 03 Apr 2024 23:59:40 +0800, Dawei Li wrote:
> This is v3 of previous series on elimination of placing cpumask var on
> stack.
>
> Change since v2:
> - Update commit message for patch[2-10]. [Mark]
> - Add Reviewed-by from Mark for patch[2-10].
>
> [...]

Applied to will (for-next/perf), thanks!

[01/10] cpumask: add cpumask_any_and_but()
https://git.kernel.org/will/c/897fa2c38c07
[02/10] perf/alibaba_uncore_drw: Avoid placing cpumask on the stack
https://git.kernel.org/will/c/2f6589df124e
[03/10] perf/arm-cmn: Avoid placing cpumask on the stack
https://git.kernel.org/will/c/60c73240f304
[04/10] perf/arm_cspmu: Avoid placing cpumask on the stack
https://git.kernel.org/will/c/b5310fa1fe8e
[05/10] perf/arm_dsu: Avoid placing cpumask on the stack
https://git.kernel.org/will/c/bea2a13b207e
[06/10] perf/dwc_pcie: Avoid placing cpumask on the stack
https://git.kernel.org/will/c/cf276ee46bc4
[07/10] perf/hisi_pcie: Avoid placing cpumask on the stack
https://git.kernel.org/will/c/d7df79e6af29
[08/10] perf/hisi_uncore: Avoid placing cpumask on the stack
https://git.kernel.org/will/c/b78d0fa25462
[09/10] perf/qcom_l2: Avoid placing cpumask on the stack
https://git.kernel.org/will/c/fc85cee97029
[10/10] perf/thunderx2: Avoid placing cpumask on the stack
https://git.kernel.org/will/c/595275ca4984

Cheers,
--
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

2024-04-05 09:57:16

by Jonathan Cameron

[permalink] [raw]
Subject: Re: [PATCH v3 08/10] perf/hisi_uncore: Avoid placing cpumask on the stack

On Wed, 3 Apr 2024 23:59:48 +0800
Dawei Li <[email protected]> wrote:

> In general it's preferable to avoid placing cpumasks on the stack, as
> for large values of NR_CPUS these can consume significant amounts of
> stack space and make stack overflows more likely.
>
> Use cpumask_any_and_but() to avoid the need for a temporary cpumask on
> the stack.
>
> Suggested-by: Mark Rutland <[email protected]>
> Reviewed-by: Mark Rutland <[email protected]>
> Signed-off-by: Dawei Li <[email protected]>
Acked-by: Jonathan Cameron <[email protected]>

2024-04-05 09:56:46

by Jonathan Cameron

[permalink] [raw]
Subject: Re: [PATCH v3 07/10] perf/hisi_pcie: Avoid placing cpumask on the stack

On Wed, 3 Apr 2024 23:59:47 +0800
Dawei Li <[email protected]> wrote:

> In general it's preferable to avoid placing cpumasks on the stack, as
> for large values of NR_CPUS these can consume significant amounts of
> stack space and make stack overflows more likely.
>
> Use cpumask_any_and_but() to avoid the need for a temporary cpumask on
> the stack.
>
> Suggested-by: Mark Rutland <[email protected]>
> Reviewed-by: Mark Rutland <[email protected]>
> Signed-off-by: Dawei Li <[email protected]>
Acked-by: Jonathan Cameron <[email protected]>