2016-03-18 04:04:28

by Michael Neuling

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path

On Wed, 2016-02-03 at 01:11 +0530, Shilpasri G Bhat wrote:

> cpu_to_chip_id() does a DT walk through to find out the chip id by
> taking a contended device tree lock. This adds an unnecessary overhead
> in a hot path. So instead of calling cpu_to_chip_id() everytime cache
> the chip ids for all cores in the array 'core_to_chip_map' and use it
> in the hotpath.
>
> Reported-by: Anton Blanchard <[email protected]>
> Signed-off-by: Shilpasri G Bhat <[email protected]>
> Reviewed-by: Gautham R. Shenoy <[email protected]>
> Acked-by: Viresh Kumar <[email protected]>
> ---
> No changes from v7.

How about this instead? It removes the linear lookup and seems a lot
less complex.

Mikey

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 547890f..d63d2cb 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -52,6 +52,7 @@ static struct chip {
} *chips;

static int nr_chips;
+static DEFINE_PER_CPU(unsigned int, chip_id);

/*
* Note: The set of pstates consists of contiguous integers, the
@@ -317,9 +318,7 @@ static void powernv_cpufreq_throttle_check(void *data)

pmsr = get_pmspr(SPRN_PMSR);

- for (i = 0; i < nr_chips; i++)
- if (chips[i].id == cpu_to_chip_id(cpu))
- break;
+ i = this_cpu_read(chip_id);

/* Check for Pmax Capping */
pmsr_pmax = (s8)PMSR_MAX(pmsr);
@@ -560,6 +559,7 @@ static int init_chip_info(void)
for_each_possible_cpu(cpu) {
unsigned int id = cpu_to_chip_id(cpu);

+ per_cpu(chip_id, cpu) = nr_chips;
if (prev_chip_id != id) {
prev_chip_id = id;
chip[nr_chips++] = id;


2016-03-18 04:12:00

by Michael Neuling

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path

On Fri, 2016-03-18 at 15:04 +1100, Michael Neuling wrote:

> On Wed, 2016-02-03 at 01:11 +0530, Shilpasri G Bhat wrote:
>

> > cpu_to_chip_id() does a DT walk through to find out the chip id by
> > taking a contended device tree lock. This adds an unnecessary
> > overhead
> > in a hot path. So instead of calling cpu_to_chip_id() everytime
> > cache
> > the chip ids for all cores in the array 'core_to_chip_map' and use
> > it
> > in the hotpath.
> >
> > Reported-by: Anton Blanchard <[email protected]>
> > Signed-off-by: Shilpasri G Bhat <[email protected]>
> > Reviewed-by: Gautham R. Shenoy <[email protected]>
> > Acked-by: Viresh Kumar <[email protected]>
> > ---
> > No changes from v7.
>
> How about this instead? It removes the linear lookup and seems a lot
> less complex.

BTW we never init nr_chips before using it. We also need something
like.

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index d63d2cb..c819ed4 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -556,6 +556,8 @@ static int init_chip_info(void)
unsigned int cpu, i;
unsigned int prev_chip_id = UINT_MAX;

+ nr_chips = 0;
+
for_each_possible_cpu(cpu) {
unsigned int id = cpu_to_chip_id(cpu);



2016-03-18 13:14:04

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path

On Fri, Mar 18, 2016 at 5:11 AM, Michael Neuling <[email protected]> wrote:
> On Fri, 2016-03-18 at 15:04 +1100, Michael Neuling wrote:
>
>> On Wed, 2016-02-03 at 01:11 +0530, Shilpasri G Bhat wrote:
>>
>
>> > cpu_to_chip_id() does a DT walk through to find out the chip id by
>> > taking a contended device tree lock. This adds an unnecessary
>> > overhead
>> > in a hot path. So instead of calling cpu_to_chip_id() everytime
>> > cache
>> > the chip ids for all cores in the array 'core_to_chip_map' and use
>> > it
>> > in the hotpath.
>> >
>> > Reported-by: Anton Blanchard <[email protected]>
>> > Signed-off-by: Shilpasri G Bhat <[email protected]>
>> > Reviewed-by: Gautham R. Shenoy <[email protected]>
>> > Acked-by: Viresh Kumar <[email protected]>
>> > ---
>> > No changes from v7.
>>
>> How about this instead? It removes the linear lookup and seems a lot
>> less complex.

This has gone in already. Can you please send a patch on top of it?

> BTW we never init nr_chips before using it. We also need something
> like.
>
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index d63d2cb..c819ed4 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -556,6 +556,8 @@ static int init_chip_info(void)
> unsigned int cpu, i;
> unsigned int prev_chip_id = UINT_MAX;
>
> + nr_chips = 0;
> +
> for_each_possible_cpu(cpu) {
> unsigned int id = cpu_to_chip_id(cpu);
>
>

Including this part too maybe?

Thanks,
Rafael

2016-03-18 14:59:32

by Shilpasri G Bhat

[permalink] [raw]
Subject: [PATCH] cpufreq: powernv: Define per_cpu chip pointer to optimize hot-path

From: Michael Neuling <[email protected]>

"cpufreq: powernv: Remove cpu_to_chip_id() from hot-path" introduced
'core_to_chip_map' array to cache the chip-id of all cores. Replace
this with per_cpu variable that stores the pointer to the chip-array.
This removes the linear lookup and provides a neater and simpler
solution.

Signed-off-by: Michael Neuling <[email protected]>
Tested-by: Shilpasri G Bhat <[email protected]>
---

- Rebased the patch on top of linux-pm/linux-next
- nr_chips is defined static, so it will be initialized to zero
- Moved the initialization of the per_cpu variable after 'chips' is
allocated
- Removed 'core_to_chip_map'

drivers/cpufreq/powernv-cpufreq.c | 50 +++++++++++++--------------------------
1 file changed, 17 insertions(+), 33 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 50bf120..a00bcc2 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -44,7 +44,6 @@

static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
static bool rebooting, throttled, occ_reset;
-static unsigned int *core_to_chip_map;

static const char * const throttle_reason[] = {
"No throttling",
@@ -65,6 +64,7 @@ static struct chip {
} *chips;

static int nr_chips;
+static DEFINE_PER_CPU(struct chip *, chip_info);

/*
* Note: The set of pstates consists of contiguous integers, the
@@ -324,34 +324,31 @@ static inline unsigned int get_nominal_index(void)

static void powernv_cpufreq_throttle_check(void *data)
{
+ struct chip *chip;
unsigned int cpu = smp_processor_id();
- unsigned int chip_id = core_to_chip_map[cpu_core_index_of_thread(cpu)];
unsigned long pmsr;
- int pmsr_pmax, i;
+ int pmsr_pmax;

pmsr = get_pmspr(SPRN_PMSR);
-
- for (i = 0; i < nr_chips; i++)
- if (chips[i].id == chip_id)
- break;
+ chip = this_cpu_read(chip_info);

/* Check for Pmax Capping */
pmsr_pmax = (s8)PMSR_MAX(pmsr);
if (pmsr_pmax != powernv_pstate_info.max) {
- if (chips[i].throttled)
+ if (chip->throttled)
goto next;
- chips[i].throttled = true;
+ chip->throttled = true;
if (pmsr_pmax < powernv_pstate_info.nominal)
pr_warn_once("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
- cpu, chips[i].id, pmsr_pmax,
+ cpu, chip->id, pmsr_pmax,
powernv_pstate_info.nominal);
- trace_powernv_throttle(chips[i].id,
- throttle_reason[chips[i].throttle_reason],
+ trace_powernv_throttle(chip->id,
+ throttle_reason[chip->throttle_reason],
pmsr_pmax);
- } else if (chips[i].throttled) {
- chips[i].throttled = false;
- trace_powernv_throttle(chips[i].id,
- throttle_reason[chips[i].throttle_reason],
+ } else if (chip->throttled) {
+ chip->throttled = false;
+ trace_powernv_throttle(chip->id,
+ throttle_reason[chip->throttle_reason],
pmsr_pmax);
}

@@ -558,47 +555,34 @@ static int init_chip_info(void)
unsigned int chip[256];
unsigned int cpu, i;
unsigned int prev_chip_id = UINT_MAX;
- cpumask_t cpu_mask;
- int ret = -ENOMEM;
-
- core_to_chip_map = kcalloc(cpu_nr_cores(), sizeof(unsigned int),
- GFP_KERNEL);
- if (!core_to_chip_map)
- goto out;

- cpumask_copy(&cpu_mask, cpu_possible_mask);
- for_each_cpu(cpu, &cpu_mask) {
+ for_each_possible_cpu(cpu) {
unsigned int id = cpu_to_chip_id(cpu);

if (prev_chip_id != id) {
prev_chip_id = id;
chip[nr_chips++] = id;
}
- core_to_chip_map[cpu_core_index_of_thread(cpu)] = id;
- cpumask_andnot(&cpu_mask, &cpu_mask, cpu_sibling_mask(cpu));
}

chips = kcalloc(nr_chips, sizeof(struct chip), GFP_KERNEL);
if (!chips)
- goto free_chip_map;
+ return -ENOMEM;

for (i = 0; i < nr_chips; i++) {
chips[i].id = chip[i];
cpumask_copy(&chips[i].mask, cpumask_of_node(chip[i]));
INIT_WORK(&chips[i].throttle, powernv_cpufreq_work_fn);
+ for_each_cpu(cpu, &chips[i].mask)
+ per_cpu(chip_info, cpu) = &chips[i];
}

return 0;
-free_chip_map:
- kfree(core_to_chip_map);
-out:
- return ret;
}

static inline void clean_chip_info(void)
{
kfree(chips);
- kfree(core_to_chip_map);
}

static inline void unregister_all_notifiers(void)
--
1.9.3

2016-03-18 22:38:25

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path

On Fri, 2016-03-18 at 15:04 +1100, Michael Neuling wrote:
>
>  static int nr_chips;
> +static DEFINE_PER_CPU(unsigned int, chip_id);
>  
>  /*
>   * Note: The set of pstates consists of contiguous integers, the
> @@ -317,9 +318,7 @@ static void powernv_cpufreq_throttle_check(void
> *data)
>  
>         pmsr = get_pmspr(SPRN_PMSR);
>  
> -       for (i = 0; i < nr_chips; i++)
> -               if (chips[i].id == cpu_to_chip_id(cpu))
> -                       break;
> +       i = this_cpu_read(chip_id);

Except it's not a chip_id, so your patch confused me for a good 2mn ...
Call it chip_idx maybe ? ie, index.

Cheers,
Ben.

>         /* Check for Pmax Capping */
>         pmsr_pmax = (s8)PMSR_MAX(pmsr);
> @@ -560,6 +559,7 @@ static int init_chip_info(void)
>         for_each_possible_cpu(cpu) {
>                 unsigned int id = cpu_to_chip_id(cpu);
>  
> +               per_cpu(chip_id, cpu) = nr_chips;
>                 if (prev_chip_id != id) {
>                         prev_chip_id = id;
>                         chip[nr_chips++] = id;
> _______________________________________________

2016-03-18 23:20:33

by Michael Neuling

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path

On Sat, 2016-03-19 at 09:37 +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2016-03-18 at 15:04 +1100, Michael Neuling wrote:
> >
> > static int nr_chips;
> > +static DEFINE_PER_CPU(unsigned int, chip_id);
> >
> > /*
> > * Note: The set of pstates consists of contiguous integers, the
> > @@ -317,9 +318,7 @@ static void powernv_cpufreq_throttle_check(void
> > *data)
> >
> > pmsr = get_pmspr(SPRN_PMSR);
> >
> > - for (i = 0; i < nr_chips; i++)
> > - if (chips[i].id == cpu_to_chip_id(cpu))
> > - break;
> > + i = this_cpu_read(chip_id);
>
> Except it's not a chip_id, so your patch confused me for a good 2mn
> ...
> Call it chip_idx maybe ? ie, index.

Yeah, it was a badly named variable but I changed it even more and
Shilpasri rebased it here:

http://patchwork.ozlabs.org/patch/599523/

Mikey

2016-03-21 07:23:08

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH] cpufreq: powernv: Define per_cpu chip pointer to optimize hot-path

On 18-03-16, 20:28, Shilpasri G Bhat wrote:
> From: Michael Neuling <[email protected]>
>
> "cpufreq: powernv: Remove cpu_to_chip_id() from hot-path" introduced

If the patch is already committed, you should provide its commit id as well.

> 'core_to_chip_map' array to cache the chip-id of all cores. Replace
> this with per_cpu variable that stores the pointer to the chip-array.
> This removes the linear lookup and provides a neater and simpler
> solution.
>
> Signed-off-by: Michael Neuling <[email protected]>
> Tested-by: Shilpasri G Bhat <[email protected]>

You are sending this patch and you have updated it as well.. So, you should have
had your signed-off here.

Acked-by: Viresh Kumar <[email protected]>

--
viresh

2016-03-21 14:11:54

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] cpufreq: powernv: Define per_cpu chip pointer to optimize hot-path

On Monday, March 21, 2016 12:52:55 PM Viresh Kumar wrote:
> On 18-03-16, 20:28, Shilpasri G Bhat wrote:
> > From: Michael Neuling <[email protected]>
> >
> > "cpufreq: powernv: Remove cpu_to_chip_id() from hot-path" introduced
>
> If the patch is already committed, you should provide its commit id as well.
>
> > 'core_to_chip_map' array to cache the chip-id of all cores. Replace
> > this with per_cpu variable that stores the pointer to the chip-array.
> > This removes the linear lookup and provides a neater and simpler
> > solution.
> >
> > Signed-off-by: Michael Neuling <[email protected]>
> > Tested-by: Shilpasri G Bhat <[email protected]>
>
> You are sending this patch and you have updated it as well.. So, you should have
> had your signed-off here.

Right. Plus if you send a patch from anyone else, you should add your
sigh-off to it anyway, even if it hasn't been modified.

So Shilpasri, please resend with your S-o-b and with the ACK from Viresh.

Thanks,
Rafael