On Wed, 2016-02-03 at 01:11 +0530, Shilpasri G Bhat wrote:
> cpu_to_chip_id() does a DT walk through to find out the chip id by
> taking a contended device tree lock. This adds an unnecessary overhead
> in a hot path. So instead of calling cpu_to_chip_id() everytime cache
> the chip ids for all cores in the array 'core_to_chip_map' and use it
> in the hotpath.
>
> Reported-by: Anton Blanchard <[email protected]>
> Signed-off-by: Shilpasri G Bhat <[email protected]>
> Reviewed-by: Gautham R. Shenoy <[email protected]>
> Acked-by: Viresh Kumar <[email protected]>
> ---
> No changes from v7.
How about this instead? It removes the linear lookup and seems a lot
less complex.
Mikey
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 547890f..d63d2cb 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -52,6 +52,7 @@ static struct chip {
} *chips;
static int nr_chips;
+static DEFINE_PER_CPU(unsigned int, chip_id);
/*
* Note: The set of pstates consists of contiguous integers, the
@@ -317,9 +318,7 @@ static void powernv_cpufreq_throttle_check(void *data)
pmsr = get_pmspr(SPRN_PMSR);
- for (i = 0; i < nr_chips; i++)
- if (chips[i].id == cpu_to_chip_id(cpu))
- break;
+ i = this_cpu_read(chip_id);
/* Check for Pmax Capping */
pmsr_pmax = (s8)PMSR_MAX(pmsr);
@@ -560,6 +559,7 @@ static int init_chip_info(void)
for_each_possible_cpu(cpu) {
unsigned int id = cpu_to_chip_id(cpu);
+ per_cpu(chip_id, cpu) = nr_chips;
if (prev_chip_id != id) {
prev_chip_id = id;
chip[nr_chips++] = id;
On Fri, 2016-03-18 at 15:04 +1100, Michael Neuling wrote:
> On Wed, 2016-02-03 at 01:11 +0530, Shilpasri G Bhat wrote:
>
> > cpu_to_chip_id() does a DT walk through to find out the chip id by
> > taking a contended device tree lock. This adds an unnecessary
> > overhead
> > in a hot path. So instead of calling cpu_to_chip_id() everytime
> > cache
> > the chip ids for all cores in the array 'core_to_chip_map' and use
> > it
> > in the hotpath.
> >
> > Reported-by: Anton Blanchard <[email protected]>
> > Signed-off-by: Shilpasri G Bhat <[email protected]>
> > Reviewed-by: Gautham R. Shenoy <[email protected]>
> > Acked-by: Viresh Kumar <[email protected]>
> > ---
> > No changes from v7.
>
> How about this instead? It removes the linear lookup and seems a lot
> less complex.
BTW we never init nr_chips before using it. We also need something
like.
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index d63d2cb..c819ed4 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -556,6 +556,8 @@ static int init_chip_info(void)
unsigned int cpu, i;
unsigned int prev_chip_id = UINT_MAX;
+ nr_chips = 0;
+
for_each_possible_cpu(cpu) {
unsigned int id = cpu_to_chip_id(cpu);
On Fri, Mar 18, 2016 at 5:11 AM, Michael Neuling <[email protected]> wrote:
> On Fri, 2016-03-18 at 15:04 +1100, Michael Neuling wrote:
>
>> On Wed, 2016-02-03 at 01:11 +0530, Shilpasri G Bhat wrote:
>>
>
>> > cpu_to_chip_id() does a DT walk through to find out the chip id by
>> > taking a contended device tree lock. This adds an unnecessary
>> > overhead
>> > in a hot path. So instead of calling cpu_to_chip_id() everytime
>> > cache
>> > the chip ids for all cores in the array 'core_to_chip_map' and use
>> > it
>> > in the hotpath.
>> >
>> > Reported-by: Anton Blanchard <[email protected]>
>> > Signed-off-by: Shilpasri G Bhat <[email protected]>
>> > Reviewed-by: Gautham R. Shenoy <[email protected]>
>> > Acked-by: Viresh Kumar <[email protected]>
>> > ---
>> > No changes from v7.
>>
>> How about this instead? It removes the linear lookup and seems a lot
>> less complex.
This has gone in already. Can you please send a patch on top of it?
> BTW we never init nr_chips before using it. We also need something
> like.
>
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index d63d2cb..c819ed4 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -556,6 +556,8 @@ static int init_chip_info(void)
> unsigned int cpu, i;
> unsigned int prev_chip_id = UINT_MAX;
>
> + nr_chips = 0;
> +
> for_each_possible_cpu(cpu) {
> unsigned int id = cpu_to_chip_id(cpu);
>
>
Including this part too maybe?
Thanks,
Rafael
From: Michael Neuling <[email protected]>
"cpufreq: powernv: Remove cpu_to_chip_id() from hot-path" introduced
'core_to_chip_map' array to cache the chip-id of all cores. Replace
this with per_cpu variable that stores the pointer to the chip-array.
This removes the linear lookup and provides a neater and simpler
solution.
Signed-off-by: Michael Neuling <[email protected]>
Tested-by: Shilpasri G Bhat <[email protected]>
---
- Rebased the patch on top of linux-pm/linux-next
- nr_chips is defined static, so it will be initialized to zero
- Moved the initialization of the per_cpu variable after 'chips' is
allocated
- Removed 'core_to_chip_map'
drivers/cpufreq/powernv-cpufreq.c | 50 +++++++++++++--------------------------
1 file changed, 17 insertions(+), 33 deletions(-)
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 50bf120..a00bcc2 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -44,7 +44,6 @@
static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
static bool rebooting, throttled, occ_reset;
-static unsigned int *core_to_chip_map;
static const char * const throttle_reason[] = {
"No throttling",
@@ -65,6 +64,7 @@ static struct chip {
} *chips;
static int nr_chips;
+static DEFINE_PER_CPU(struct chip *, chip_info);
/*
* Note: The set of pstates consists of contiguous integers, the
@@ -324,34 +324,31 @@ static inline unsigned int get_nominal_index(void)
static void powernv_cpufreq_throttle_check(void *data)
{
+ struct chip *chip;
unsigned int cpu = smp_processor_id();
- unsigned int chip_id = core_to_chip_map[cpu_core_index_of_thread(cpu)];
unsigned long pmsr;
- int pmsr_pmax, i;
+ int pmsr_pmax;
pmsr = get_pmspr(SPRN_PMSR);
-
- for (i = 0; i < nr_chips; i++)
- if (chips[i].id == chip_id)
- break;
+ chip = this_cpu_read(chip_info);
/* Check for Pmax Capping */
pmsr_pmax = (s8)PMSR_MAX(pmsr);
if (pmsr_pmax != powernv_pstate_info.max) {
- if (chips[i].throttled)
+ if (chip->throttled)
goto next;
- chips[i].throttled = true;
+ chip->throttled = true;
if (pmsr_pmax < powernv_pstate_info.nominal)
pr_warn_once("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
- cpu, chips[i].id, pmsr_pmax,
+ cpu, chip->id, pmsr_pmax,
powernv_pstate_info.nominal);
- trace_powernv_throttle(chips[i].id,
- throttle_reason[chips[i].throttle_reason],
+ trace_powernv_throttle(chip->id,
+ throttle_reason[chip->throttle_reason],
pmsr_pmax);
- } else if (chips[i].throttled) {
- chips[i].throttled = false;
- trace_powernv_throttle(chips[i].id,
- throttle_reason[chips[i].throttle_reason],
+ } else if (chip->throttled) {
+ chip->throttled = false;
+ trace_powernv_throttle(chip->id,
+ throttle_reason[chip->throttle_reason],
pmsr_pmax);
}
@@ -558,47 +555,34 @@ static int init_chip_info(void)
unsigned int chip[256];
unsigned int cpu, i;
unsigned int prev_chip_id = UINT_MAX;
- cpumask_t cpu_mask;
- int ret = -ENOMEM;
-
- core_to_chip_map = kcalloc(cpu_nr_cores(), sizeof(unsigned int),
- GFP_KERNEL);
- if (!core_to_chip_map)
- goto out;
- cpumask_copy(&cpu_mask, cpu_possible_mask);
- for_each_cpu(cpu, &cpu_mask) {
+ for_each_possible_cpu(cpu) {
unsigned int id = cpu_to_chip_id(cpu);
if (prev_chip_id != id) {
prev_chip_id = id;
chip[nr_chips++] = id;
}
- core_to_chip_map[cpu_core_index_of_thread(cpu)] = id;
- cpumask_andnot(&cpu_mask, &cpu_mask, cpu_sibling_mask(cpu));
}
chips = kcalloc(nr_chips, sizeof(struct chip), GFP_KERNEL);
if (!chips)
- goto free_chip_map;
+ return -ENOMEM;
for (i = 0; i < nr_chips; i++) {
chips[i].id = chip[i];
cpumask_copy(&chips[i].mask, cpumask_of_node(chip[i]));
INIT_WORK(&chips[i].throttle, powernv_cpufreq_work_fn);
+ for_each_cpu(cpu, &chips[i].mask)
+ per_cpu(chip_info, cpu) = &chips[i];
}
return 0;
-free_chip_map:
- kfree(core_to_chip_map);
-out:
- return ret;
}
static inline void clean_chip_info(void)
{
kfree(chips);
- kfree(core_to_chip_map);
}
static inline void unregister_all_notifiers(void)
--
1.9.3
On Fri, 2016-03-18 at 15:04 +1100, Michael Neuling wrote:
>
> static int nr_chips;
> +static DEFINE_PER_CPU(unsigned int, chip_id);
>
> /*
> * Note: The set of pstates consists of contiguous integers, the
> @@ -317,9 +318,7 @@ static void powernv_cpufreq_throttle_check(void
> *data)
>
> pmsr = get_pmspr(SPRN_PMSR);
>
> - for (i = 0; i < nr_chips; i++)
> - if (chips[i].id == cpu_to_chip_id(cpu))
> - break;
> + i = this_cpu_read(chip_id);
Except it's not a chip_id, so your patch confused me for a good 2mn ...
Call it chip_idx maybe ? ie, index.
Cheers,
Ben.
> /* Check for Pmax Capping */
> pmsr_pmax = (s8)PMSR_MAX(pmsr);
> @@ -560,6 +559,7 @@ static int init_chip_info(void)
> for_each_possible_cpu(cpu) {
> unsigned int id = cpu_to_chip_id(cpu);
>
> + per_cpu(chip_id, cpu) = nr_chips;
> if (prev_chip_id != id) {
> prev_chip_id = id;
> chip[nr_chips++] = id;
> _______________________________________________
On Sat, 2016-03-19 at 09:37 +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2016-03-18 at 15:04 +1100, Michael Neuling wrote:
> >
> > static int nr_chips;
> > +static DEFINE_PER_CPU(unsigned int, chip_id);
> >
> > /*
> > * Note: The set of pstates consists of contiguous integers, the
> > @@ -317,9 +318,7 @@ static void powernv_cpufreq_throttle_check(void
> > *data)
> >
> > pmsr = get_pmspr(SPRN_PMSR);
> >
> > - for (i = 0; i < nr_chips; i++)
> > - if (chips[i].id == cpu_to_chip_id(cpu))
> > - break;
> > + i = this_cpu_read(chip_id);
>
> Except it's not a chip_id, so your patch confused me for a good 2mn
> ...
> Call it chip_idx maybe ? ie, index.
Yeah, it was a badly named variable but I changed it even more and
Shilpasri rebased it here:
http://patchwork.ozlabs.org/patch/599523/
Mikey
On 18-03-16, 20:28, Shilpasri G Bhat wrote:
> From: Michael Neuling <[email protected]>
>
> "cpufreq: powernv: Remove cpu_to_chip_id() from hot-path" introduced
If the patch is already committed, you should provide its commit id as well.
> 'core_to_chip_map' array to cache the chip-id of all cores. Replace
> this with per_cpu variable that stores the pointer to the chip-array.
> This removes the linear lookup and provides a neater and simpler
> solution.
>
> Signed-off-by: Michael Neuling <[email protected]>
> Tested-by: Shilpasri G Bhat <[email protected]>
You are sending this patch and you have updated it as well.. So, you should have
had your signed-off here.
Acked-by: Viresh Kumar <[email protected]>
--
viresh
On Monday, March 21, 2016 12:52:55 PM Viresh Kumar wrote:
> On 18-03-16, 20:28, Shilpasri G Bhat wrote:
> > From: Michael Neuling <[email protected]>
> >
> > "cpufreq: powernv: Remove cpu_to_chip_id() from hot-path" introduced
>
> If the patch is already committed, you should provide its commit id as well.
>
> > 'core_to_chip_map' array to cache the chip-id of all cores. Replace
> > this with per_cpu variable that stores the pointer to the chip-array.
> > This removes the linear lookup and provides a neater and simpler
> > solution.
> >
> > Signed-off-by: Michael Neuling <[email protected]>
> > Tested-by: Shilpasri G Bhat <[email protected]>
>
> You are sending this patch and you have updated it as well.. So, you should have
> had your signed-off here.
Right. Plus if you send a patch from anyone else, you should add your
sigh-off to it anyway, even if it hasn't been modified.
So Shilpasri, please resend with your S-o-b and with the ACK from Viresh.
Thanks,
Rafael