2023-10-14 10:55:15

by Sumit Gupta

Subject: [Patch v5 0/2] Add support for _TFP and change throttle pctg

This patch set adds two improvements that give finer control over the
impact of thermal throttling on performance. Please merge the patches
if no further changes are needed.

1) Patch 1: Adds support for reading the "Thermal fast Sampling Period
(_TFP)" ACPI object and using it instead of "Thermal Sampling Period
(_TSP)" for passive cooling if both are present.

2) Patch 2: Makes the CPUFREQ reduction percentage configurable, so that
throttling does not always happen in steps of "20%", and lowers it for
the Tegra241 SoC.

Both patches can be applied independently.
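
As an illustration of the selection rule in patch 1, here is a minimal userspace sketch, not the kernel code. It assumes the units given in the ACPI specification (_TFP in milliseconds, _TSP in tenths of a second); the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the two optional ACPI objects. */
struct sampling_objects {
	bool has_tfp;            /* is _TFP present? */
	unsigned long long tfp;  /* _TFP value, assumed to be in milliseconds */
	bool has_tsp;            /* is _TSP present? */
	unsigned long long tsp;  /* _TSP value, assumed to be in tenths of a second */
};

/*
 * Return the passive polling delay in milliseconds, preferring _TFP
 * over _TSP when both are present. Returns -1 if neither exists.
 */
static long long passive_delay_ms(const struct sampling_objects *o)
{
	assert(o); /* caller must pass a valid pointer */

	if (o->has_tfp)
		return (long long)o->tfp;
	if (o->has_tsp)
		return (long long)o->tsp * 100; /* 0.1 s units -> ms */
	return -1;
}
```

With both objects present, the _TFP value wins; with only _TSP present, its value is scaled to milliseconds.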

---
v4[4] -> v5:
- Patch 2: fix kernel robot warning for acpi_thermal_cpufreq_pctg().

v3[3] -> v4:
- Patch 2: move ARM code from generic to new file 'thermal_cpufreq.c'.
: get 'cpufreq_thermal_pctg' value for Tegra241 from new file.
: move dummy/null function to 'acpi.h'.

v2[2] -> v3:
- Patch1: rebased on top of linux-next.
- Patch2: use __read_mostly for the cpufreq_thermal_* variables.
: add static to new function acpi_thermal_cpufreq_config_nvidia.
: add null function if CONFIG_HAVE_ARM_SMCCC_DISCOVERY undefined
: removed redundant parenthesis.

v1[1] -> v2:
- Patch1: add ACPI spec section info in commit description and rebased.
- Patch2: add info about hardware in the commit description.
: switched CPUFREQ THERMAL tuning macros to static variables.
: update the tunings for Tegra241 SoC only using soc_id check.

Jeff Brasen (1):
ACPI: thermal: Add Thermal fast Sampling Period (_TFP) support

Srikar Srimath Tirumala (1):
ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241

drivers/acpi/arm64/Makefile | 1 +
drivers/acpi/arm64/thermal_cpufreq.c | 20 ++++++++++++++++
drivers/acpi/processor_thermal.c | 35 +++++++++++++++++++++++++---
drivers/acpi/thermal.c | 17 +++++++++-----
include/linux/acpi.h | 9 +++++++
5 files changed, 73 insertions(+), 9 deletions(-)
create mode 100644 drivers/acpi/arm64/thermal_cpufreq.c

[4] https://lore.kernel.org/lkml/[email protected]/
[3] https://lore.kernel.org/linux-acpi/[email protected]/
[2] https://lore.kernel.org/lkml/[email protected]/
[1] https://lore.kernel.org/lkml/[email protected]/

--
2.17.1


2023-10-14 10:55:18

by Sumit Gupta

Subject: [Patch v5 2/2] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241

From: Srikar Srimath Tirumala <[email protected]>

The current implementation of processor_thermal performs software throttling
in fixed steps of "20%", which can be too coarse for some platforms.
We observed some performance gain after reducing the throttle percentage.
Make the CPUFREQ thermal reduction percentage and the maximum number of
thermal steps configurable. Also, update the default values of both for the
Nvidia Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to
"5%" and the maximum number of thermal steps is increased accordingly, as it
is derived from the reduction percentage.

Signed-off-by: Srikar Srimath Tirumala <[email protected]>
Co-developed-by: Sumit Gupta <[email protected]>
Signed-off-by: Sumit Gupta <[email protected]>
---
drivers/acpi/arm64/Makefile | 1 +
drivers/acpi/arm64/thermal_cpufreq.c | 20 ++++++++++++++++
drivers/acpi/processor_thermal.c | 35 +++++++++++++++++++++++++---
include/linux/acpi.h | 9 +++++++
4 files changed, 62 insertions(+), 3 deletions(-)
create mode 100644 drivers/acpi/arm64/thermal_cpufreq.c

diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
index 143debc1ba4a..3f181d8156cc 100644
--- a/drivers/acpi/arm64/Makefile
+++ b/drivers/acpi/arm64/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
obj-$(CONFIG_ACPI_APMT) += apmt.o
obj-$(CONFIG_ARM_AMBA) += amba.o
obj-y += dma.o init.o
+obj-$(CONFIG_ACPI) += thermal_cpufreq.o
diff --git a/drivers/acpi/arm64/thermal_cpufreq.c b/drivers/acpi/arm64/thermal_cpufreq.c
new file mode 100644
index 000000000000..de834fb013e7
--- /dev/null
+++ b/drivers/acpi/arm64/thermal_cpufreq.c
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/acpi.h>
+
+#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
+#define SMCCC_SOC_ID_T241 0x036b0241
+
+int acpi_thermal_cpufreq_pctg(void)
+{
+	s32 soc_id = arm_smccc_get_soc_id_version();
+
+	/*
+	 * Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) and
+	 * reduce the CPUFREQ Thermal reduction percentage to 5%.
+	 */
+	if (soc_id == SMCCC_SOC_ID_T241)
+		return 5;
+
+	return 0;
+}
+#endif
diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c
index b7c6287eccca..52f316e4e260 100644
--- a/drivers/acpi/processor_thermal.c
+++ b/drivers/acpi/processor_thermal.c
@@ -26,7 +26,16 @@
*/

#define CPUFREQ_THERMAL_MIN_STEP 0
-#define CPUFREQ_THERMAL_MAX_STEP 3
+
+static int cpufreq_thermal_max_step __read_mostly = 3;
+
+/*
+ * Minimum throttle percentage for processor_thermal cooling device.
+ * The processor_thermal driver uses it to calculate the percentage amount by
+ * which cpu frequency must be reduced for each cooling state. This is also used
+ * to calculate the maximum number of throttling steps or cooling states.
+ */
+static int cpufreq_thermal_pctg __read_mostly = 20;

static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_pctg);

@@ -71,7 +80,7 @@ static int cpufreq_get_max_state(unsigned int cpu)
if (!cpu_has_cpufreq(cpu))
return 0;

- return CPUFREQ_THERMAL_MAX_STEP;
+ return cpufreq_thermal_max_step;
}

static int cpufreq_get_cur_state(unsigned int cpu)
@@ -113,7 +122,8 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
if (!policy)
return -EINVAL;

- max_freq = (policy->cpuinfo.max_freq * (100 - reduction_pctg(i) * 20)) / 100;
+ max_freq = (policy->cpuinfo.max_freq *
+ (100 - reduction_pctg(i) * cpufreq_thermal_pctg)) / 100;

cpufreq_cpu_put(policy);

@@ -126,10 +136,29 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
return 0;
}

+static void acpi_thermal_cpufreq_config(void)
+{
+	int cpufreq_pctg = acpi_thermal_cpufreq_pctg();
+
+	if (!cpufreq_pctg)
+		return;
+
+	cpufreq_thermal_pctg = cpufreq_pctg;
+
+	/*
+	 * Derive the MAX_STEP from minimum throttle percentage so that the reduction
+	 * percentage doesn't end up becoming negative. Also, cap the MAX_STEP so that
+	 * the CPU performance doesn't become 0.
+	 */
+	cpufreq_thermal_max_step = (100 / cpufreq_thermal_pctg) - 1;
+}
+
void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy)
{
unsigned int cpu;

+ acpi_thermal_cpufreq_config();
+
for_each_cpu(cpu, policy->related_cpus) {
struct acpi_processor *pr = per_cpu(processors, cpu);
int ret;
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index ba3f601b6e3d..407617670221 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1541,4 +1541,13 @@ static inline void acpi_device_notify(struct device *dev) { }
static inline void acpi_device_notify_remove(struct device *dev) { }
#endif

+#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
+int acpi_thermal_cpufreq_pctg(void);
+#else
+static inline int acpi_thermal_cpufreq_pctg(void)
+{
+	return 0;
+}
+#endif
+
#endif /*_LINUX_ACPI_H*/
--
2.17.1

2023-10-18 13:06:11

by Rafael J. Wysocki

Subject: Re: [Patch v5 2/2] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241

On Sat, Oct 14, 2023 at 12:55 PM Sumit Gupta <[email protected]> wrote:
>
> From: Srikar Srimath Tirumala <[email protected]>
>
> Current implementation of processor_thermal performs software throttling
> in fixed steps of "20%" which can be too coarse for some platforms.
> We observed some performance gain after reducing the throttle percentage.
> Change the CPUFREQ thermal reduction percentage and maximum thermal steps
> to be configurable. Also, update the default values of both for Nvidia
> Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to "5%"
> and accordingly the maximum number of thermal steps are increased as they
> are derived from the reduction percentage.
>
> Signed-off-by: Srikar Srimath Tirumala <[email protected]>
> Co-developed-by: Sumit Gupta <[email protected]>
> Signed-off-by: Sumit Gupta <[email protected]>
> ---
> drivers/acpi/arm64/Makefile | 1 +
> drivers/acpi/arm64/thermal_cpufreq.c | 20 ++++++++++++++++
> drivers/acpi/processor_thermal.c | 35 +++++++++++++++++++++++++---
> include/linux/acpi.h | 9 +++++++
> 4 files changed, 62 insertions(+), 3 deletions(-)
> create mode 100644 drivers/acpi/arm64/thermal_cpufreq.c
>
> diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
> index 143debc1ba4a..3f181d8156cc 100644
> --- a/drivers/acpi/arm64/Makefile
> +++ b/drivers/acpi/arm64/Makefile
> @@ -5,3 +5,4 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
> obj-$(CONFIG_ACPI_APMT) += apmt.o
> obj-$(CONFIG_ARM_AMBA) += amba.o
> obj-y += dma.o init.o
> +obj-$(CONFIG_ACPI) += thermal_cpufreq.o
> diff --git a/drivers/acpi/arm64/thermal_cpufreq.c b/drivers/acpi/arm64/thermal_cpufreq.c
> new file mode 100644
> index 000000000000..de834fb013e7
> --- /dev/null
> +++ b/drivers/acpi/arm64/thermal_cpufreq.c
> @@ -0,0 +1,20 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +#include <linux/acpi.h>
> +
> +#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
> +#define SMCCC_SOC_ID_T241 0x036b0241
> +
> +int acpi_thermal_cpufreq_pctg(void)
> +{
> + s32 soc_id = arm_smccc_get_soc_id_version();
> +
> + /*
> + * Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) and
> + * reduce the CPUFREQ Thermal reduction percentage to 5%.
> + */
> + if (soc_id == SMCCC_SOC_ID_T241)
> + return 5;
> +
> + return 0;
> +}
> +#endif

This part needs an ACK from the ARM folks.

> diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c
> index b7c6287eccca..52f316e4e260 100644
> --- a/drivers/acpi/processor_thermal.c
> +++ b/drivers/acpi/processor_thermal.c
> @@ -26,7 +26,16 @@
> */
>
> #define CPUFREQ_THERMAL_MIN_STEP 0
> -#define CPUFREQ_THERMAL_MAX_STEP 3
> +
> +static int cpufreq_thermal_max_step __read_mostly = 3;
> +
> +/*
> + * Minimum throttle percentage for processor_thermal cooling device.
> + * The processor_thermal driver uses it to calculate the percentage amount by
> + * which cpu frequency must be reduced for each cooling state. This is also used
> + * to calculate the maximum number of throttling steps or cooling states.
> + */
> +static int cpufreq_thermal_pctg __read_mostly = 20;

I'd call this cpufreq_thermal_reduction_step, because the value
multiplied by it already is in percent.

>
> static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_pctg);
>
> @@ -71,7 +80,7 @@ static int cpufreq_get_max_state(unsigned int cpu)
> if (!cpu_has_cpufreq(cpu))
> return 0;
>
> - return CPUFREQ_THERMAL_MAX_STEP;
> + return cpufreq_thermal_max_step;
> }
>
> static int cpufreq_get_cur_state(unsigned int cpu)
> @@ -113,7 +122,8 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
> if (!policy)
> return -EINVAL;
>
> - max_freq = (policy->cpuinfo.max_freq * (100 - reduction_pctg(i) * 20)) / 100;
> + max_freq = (policy->cpuinfo.max_freq *
> + (100 - reduction_pctg(i) * cpufreq_thermal_pctg)) / 100;
>
> cpufreq_cpu_put(policy);
>
> @@ -126,10 +136,29 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
> return 0;
> }
>
> +static void acpi_thermal_cpufreq_config(void)
> +{
> + int cpufreq_pctg = acpi_thermal_cpufreq_pctg();
> +
> + if (!cpufreq_pctg)
> + return;
> +
> + cpufreq_thermal_pctg = cpufreq_pctg;
> +
> + /*
> + * Derive the MAX_STEP from minimum throttle percentage so that the reduction
> + * percentage doesn't end up becoming negative. Also, cap the MAX_STEP so that
> + * the CPU performance doesn't become 0.
> + */
> + cpufreq_thermal_max_step = (100 / cpufreq_thermal_pctg) - 1;

Why don't you use the local variable in the expression on the right-hand side?

Also please note that the formula doesn't allow the default
combination of reduction_step and max_step to be produced which is a
bit odd.

What would be wrong with max_step = 60 / reduction_step?
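
To make the point about the defaults concrete, the following sketch compares the v5 formula with the suggested one. It is plain userspace C for illustration only; the function names are hypothetical and not part of the patch.

```c
#include <assert.h>

/* v5 formula: (100 / step) - 1, where step is the reduction per state in percent. */
static int max_step_v5(int reduction_step)
{
	assert(reduction_step > 0 && reduction_step <= 100);
	return (100 / reduction_step) - 1;
}

/* Suggested formula: 60 / step, which caps the deepest reduction at 60%. */
static int max_step_suggested(int reduction_step)
{
	assert(reduction_step > 0 && reduction_step <= 100);
	return 60 / reduction_step;
}
```

With the default step of 20, the v5 formula yields 4 rather than the existing maximum of 3, while 60 / 20 yields 3. For a step of 5, the two formulas give 19 and 12 respectively.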

> +}
> +
> void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy)
> {
> unsigned int cpu;
>
> + acpi_thermal_cpufreq_config();
> +
> for_each_cpu(cpu, policy->related_cpus) {
> struct acpi_processor *pr = per_cpu(processors, cpu);
> int ret;
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index ba3f601b6e3d..407617670221 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -1541,4 +1541,13 @@ static inline void acpi_device_notify(struct device *dev) { }
> static inline void acpi_device_notify_remove(struct device *dev) { }
> #endif
>
> +#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
> +int acpi_thermal_cpufreq_pctg(void);
> +#else
> +static inline int acpi_thermal_cpufreq_pctg(void)
> +{
> + return 0;
> +}
> +#endif
> +

This can go into drivers/acpi/internal.h as far as I'm concerned.

> #endif /*_LINUX_ACPI_H*/
> --

2023-10-20 08:31:24

by Sumit Gupta

Subject: Re: [Patch v5 2/2] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241


>> Current implementation of processor_thermal performs software throttling
>> in fixed steps of "20%" which can be too coarse for some platforms.
>> We observed some performance gain after reducing the throttle percentage.
>> Change the CPUFREQ thermal reduction percentage and maximum thermal steps
>> to be configurable. Also, update the default values of both for Nvidia
>> Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to "5%"
>> and accordingly the maximum number of thermal steps are increased as they
>> are derived from the reduction percentage.
>>
>> Signed-off-by: Srikar Srimath Tirumala <[email protected]>
>> Co-developed-by: Sumit Gupta <[email protected]>
>> Signed-off-by: Sumit Gupta <[email protected]>
>> ---
>> drivers/acpi/arm64/Makefile | 1 +
>> drivers/acpi/arm64/thermal_cpufreq.c | 20 ++++++++++++++++
>> drivers/acpi/processor_thermal.c | 35 +++++++++++++++++++++++++---
>> include/linux/acpi.h | 9 +++++++
>> 4 files changed, 62 insertions(+), 3 deletions(-)
>> create mode 100644 drivers/acpi/arm64/thermal_cpufreq.c
>>
>> diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
>> index 143debc1ba4a..3f181d8156cc 100644
>> --- a/drivers/acpi/arm64/Makefile
>> +++ b/drivers/acpi/arm64/Makefile
>> @@ -5,3 +5,4 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
>> obj-$(CONFIG_ACPI_APMT) += apmt.o
>> obj-$(CONFIG_ARM_AMBA) += amba.o
>> obj-y += dma.o init.o
>> +obj-$(CONFIG_ACPI) += thermal_cpufreq.o
>> diff --git a/drivers/acpi/arm64/thermal_cpufreq.c b/drivers/acpi/arm64/thermal_cpufreq.c
>> new file mode 100644
>> index 000000000000..de834fb013e7
>> --- /dev/null
>> +++ b/drivers/acpi/arm64/thermal_cpufreq.c
>> @@ -0,0 +1,20 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +#include <linux/acpi.h>
>> +
>> +#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
>> +#define SMCCC_SOC_ID_T241 0x036b0241
>> +
>> +int acpi_thermal_cpufreq_pctg(void)
>> +{
>> + s32 soc_id = arm_smccc_get_soc_id_version();
>> +
>> + /*
>> + * Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) and
>> + * reduce the CPUFREQ Thermal reduction percentage to 5%.
>> + */
>> + if (soc_id == SMCCC_SOC_ID_T241)
>> + return 5;
>> +
>> + return 0;
>> +}
>> +#endif
>
> This part needs an ACK from the ARM folks.
>
Sorry, I missed adding the 'ACPI arm64' maintainers. I have now added
Lorenzo, Sudeep and Hanjun.

>> diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c
>> index b7c6287eccca..52f316e4e260 100644
>> --- a/drivers/acpi/processor_thermal.c
>> +++ b/drivers/acpi/processor_thermal.c
>> @@ -26,7 +26,16 @@
>> */
>>
>> #define CPUFREQ_THERMAL_MIN_STEP 0
>> -#define CPUFREQ_THERMAL_MAX_STEP 3
>> +
>> +static int cpufreq_thermal_max_step __read_mostly = 3;
>> +
>> +/*
>> + * Minimum throttle percentage for processor_thermal cooling device.
>> + * The processor_thermal driver uses it to calculate the percentage amount by
>> + * which cpu frequency must be reduced for each cooling state. This is also used
>> + * to calculate the maximum number of throttling steps or cooling states.
>> + */
>> +static int cpufreq_thermal_pctg __read_mostly = 20;
>
> I'd call this cpufreq_thermal_reduction_step, because the value
> multiplied by it already is in percent.
>

This is multiplied by reduction_pctg(), which actually seems to return
the reduction step rather than a percentage.

Could we instead rename the existing 'reduction_pctg' to 'reduction_step'
and 'cpufreq_thermal_pctg' to 'cpufreq_thermal_reduction_pctg' for more
clarity? Please suggest.
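
For readers following the naming question, this is a small sketch of the arithmetic in cpufreq_set_cur_state() using the names proposed here: a per-CPU step count multiplied by a global percentage per step. It is a userspace illustration under those assumptions, not the driver code, and the function name is hypothetical.

```c
#include <assert.h>

/*
 * Frequency cap for a CPU: cpuinfo_max_freq reduced by
 * reduction_step * reduction_pctg percent. Frequencies are in kHz,
 * as in cpufreq, so the intermediate product fits in 32 bits for
 * realistic values.
 */
static unsigned int capped_max_freq(unsigned int cpuinfo_max_freq,
				    unsigned int reduction_step,
				    unsigned int reduction_pctg)
{
	/* The total reduction must stay below 100% so the cap never hits 0. */
	assert(reduction_step * reduction_pctg < 100);
	return (cpuinfo_max_freq * (100 - reduction_step * reduction_pctg)) / 100;
}
```

For example, a 3000000 kHz CPU at step 1 with 20% per step is capped at 2400000 kHz, and at step 3 with 5% per step at 2550000 kHz.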

>>
>> static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_pctg);
>>
>> @@ -71,7 +80,7 @@ static int cpufreq_get_max_state(unsigned int cpu)
>> if (!cpu_has_cpufreq(cpu))
>> return 0;
>>
>> - return CPUFREQ_THERMAL_MAX_STEP;
>> + return cpufreq_thermal_max_step;
>> }
>>
>> static int cpufreq_get_cur_state(unsigned int cpu)
>> @@ -113,7 +122,8 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
>> if (!policy)
>> return -EINVAL;
>>
>> - max_freq = (policy->cpuinfo.max_freq * (100 - reduction_pctg(i) * 20)) / 100;
>> + max_freq = (policy->cpuinfo.max_freq *
>> + (100 - reduction_pctg(i) * cpufreq_thermal_pctg)) / 100;
>>
>> cpufreq_cpu_put(policy);
>>
>> @@ -126,10 +136,29 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
>> return 0;
>> }
>>
>> +static void acpi_thermal_cpufreq_config(void)
>> +{
>> + int cpufreq_pctg = acpi_thermal_cpufreq_pctg();
>> +
>> + if (!cpufreq_pctg)
>> + return;
>> +
>> + cpufreq_thermal_pctg = cpufreq_pctg;
>> +
>> + /*
>> + * Derive the MAX_STEP from minimum throttle percentage so that the reduction
>> + * percentage doesn't end up becoming negative. Also, cap the MAX_STEP so that
>> + * the CPU performance doesn't become 0.
>> + */
>> + cpufreq_thermal_max_step = (100 / cpufreq_thermal_pctg) - 1;
>
> Why don't you use the local variable in the expression on the right-hand side?
>
Ok.

> Also please note that the formula doesn't allow the default
> combination of reduction_step and max_step to be produced which is a
> bit odd.
>
> What would be wrong with max_step = 60 / reduction_step?
>

The new formula will be applied only to Tegra241, as this function
returns early for other SoCs. If we still want it to produce the
default max step ('3') when acpi_thermal_cpufreq_pctg() returns the
default pctg ('20'), then we can change it to the following.

cpufreq_thermal_max_step = (100 / cpufreq_thermal_reduction_pctg) - 2;
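
A quick check of the alternative proposed above, written as hypothetical userspace helpers rather than driver code. The second helper reports the lowest frequency cap that the deepest cooling state would allow, as a percentage of the maximum frequency.

```c
#include <assert.h>

/* Alternative formula from this reply: (100 / pctg) - 2. */
static int max_step_alt(int reduction_pctg)
{
	/* Above 50% the result would be negative. */
	assert(reduction_pctg > 0 && reduction_pctg <= 50);
	return (100 / reduction_pctg) - 2;
}

/* Percentage of the maximum frequency left at the deepest cooling state. */
static int floor_percent(int reduction_pctg)
{
	return 100 - max_step_alt(reduction_pctg) * reduction_pctg;
}
```

For a pctg of 20 this reproduces the default of 3 steps and leaves 40% of the maximum frequency; for a pctg of 5 it gives 18 steps and leaves 10%.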

>> +}
>> +
>> void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy)
>> {
>> unsigned int cpu;
>>
>> + acpi_thermal_cpufreq_config();
>> +
>> for_each_cpu(cpu, policy->related_cpus) {
>> struct acpi_processor *pr = per_cpu(processors, cpu);
>> int ret;
>> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
>> index ba3f601b6e3d..407617670221 100644
>> --- a/include/linux/acpi.h
>> +++ b/include/linux/acpi.h
>> @@ -1541,4 +1541,13 @@ static inline void acpi_device_notify(struct device *dev) { }
>> static inline void acpi_device_notify_remove(struct device *dev) { }
>> #endif
>>
>> +#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
>> +int acpi_thermal_cpufreq_pctg(void);
>> +#else
>> +static inline int acpi_thermal_cpufreq_pctg(void)
>> +{
>> + return 0;
>> +}
>> +#endif
>> +
>
> This can go into drivers/acpi/internal.h as far as I'm concerned.
>

Ok. Will move this to 'internal.h' in v6.

>> #endif /*_LINUX_ACPI_H*/
>> --

2023-10-23 08:53:40

by Sudeep Holla

Subject: Re: [Patch v5 2/2] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241

On Sat, Oct 14, 2023 at 04:24:26PM +0530, Sumit Gupta wrote:
> From: Srikar Srimath Tirumala <[email protected]>
>
> Current implementation of processor_thermal performs software throttling
> in fixed steps of "20%" which can be too coarse for some platforms.
> We observed some performance gain after reducing the throttle percentage.
> Change the CPUFREQ thermal reduction percentage and maximum thermal steps
> to be configurable. Also, update the default values of both for Nvidia
> Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to "5%"
> and accordingly the maximum number of thermal steps are increased as they
> are derived from the reduction percentage.
>
> Signed-off-by: Srikar Srimath Tirumala <[email protected]>
> Co-developed-by: Sumit Gupta <[email protected]>
> Signed-off-by: Sumit Gupta <[email protected]>
> ---
> drivers/acpi/arm64/Makefile | 1 +
> drivers/acpi/arm64/thermal_cpufreq.c | 20 ++++++++++++++++
> drivers/acpi/processor_thermal.c | 35 +++++++++++++++++++++++++---
> include/linux/acpi.h | 9 +++++++
> 4 files changed, 62 insertions(+), 3 deletions(-)
> create mode 100644 drivers/acpi/arm64/thermal_cpufreq.c
>
> diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
> index 143debc1ba4a..3f181d8156cc 100644
> --- a/drivers/acpi/arm64/Makefile
> +++ b/drivers/acpi/arm64/Makefile
> @@ -5,3 +5,4 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
> obj-$(CONFIG_ACPI_APMT) += apmt.o
> obj-$(CONFIG_ARM_AMBA) += amba.o
> obj-y += dma.o init.o
> +obj-$(CONFIG_ACPI) += thermal_cpufreq.o

Do we really need CONFIG_ACPI here? We won't be building this if it
is not enabled.

If this is for building as a module, would it make sense to use a
more specific config option, maybe CONFIG_ACPI_THERMAL?

> diff --git a/drivers/acpi/arm64/thermal_cpufreq.c b/drivers/acpi/arm64/thermal_cpufreq.c
> new file mode 100644
> index 000000000000..de834fb013e7
> --- /dev/null
> +++ b/drivers/acpi/arm64/thermal_cpufreq.c
> @@ -0,0 +1,20 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +#include <linux/acpi.h>
> +
> +#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
> +#define SMCCC_SOC_ID_T241 0x036b0241
> +
> +int acpi_thermal_cpufreq_pctg(void)
> +{
> + s32 soc_id = arm_smccc_get_soc_id_version();
> +
> + /*
> + * Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) and
> + * reduce the CPUFREQ Thermal reduction percentage to 5%.
> + */
> + if (soc_id == SMCCC_SOC_ID_T241)
> + return 5;
> +
> + return 0;
> +}
> +#endif

Since this looks like an arch-specific hook/callback, I wonder whether it
would be a good idea to have "arch_" in the function name. But if Rafael is
OK with the name, I am fine with this as well.

--
Regards,
Sudeep

2023-10-25 12:52:00

by Sumit Gupta

Subject: Re: [Patch v5 2/2] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241



On 23/10/23 14:23, Sudeep Holla wrote:
> On Sat, Oct 14, 2023 at 04:24:26PM +0530, Sumit Gupta wrote:
>> From: Srikar Srimath Tirumala <[email protected]>
>>
>> Current implementation of processor_thermal performs software throttling
>> in fixed steps of "20%" which can be too coarse for some platforms.
>> We observed some performance gain after reducing the throttle percentage.
>> Change the CPUFREQ thermal reduction percentage and maximum thermal steps
>> to be configurable. Also, update the default values of both for Nvidia
>> Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to "5%"
>> and accordingly the maximum number of thermal steps are increased as they
>> are derived from the reduction percentage.
>>
>> Signed-off-by: Srikar Srimath Tirumala <[email protected]>
>> Co-developed-by: Sumit Gupta <[email protected]>
>> Signed-off-by: Sumit Gupta <[email protected]>
>> ---
>> drivers/acpi/arm64/Makefile | 1 +
>> drivers/acpi/arm64/thermal_cpufreq.c | 20 ++++++++++++++++
>> drivers/acpi/processor_thermal.c | 35 +++++++++++++++++++++++++---
>> include/linux/acpi.h | 9 +++++++
>> 4 files changed, 62 insertions(+), 3 deletions(-)
>> create mode 100644 drivers/acpi/arm64/thermal_cpufreq.c
>>
>> diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
>> index 143debc1ba4a..3f181d8156cc 100644
>> --- a/drivers/acpi/arm64/Makefile
>> +++ b/drivers/acpi/arm64/Makefile
>> @@ -5,3 +5,4 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
>> obj-$(CONFIG_ACPI_APMT) += apmt.o
>> obj-$(CONFIG_ARM_AMBA) += amba.o
>> obj-y += dma.o init.o
>> +obj-$(CONFIG_ACPI) += thermal_cpufreq.o
>
> Do we really need CONFIG_ACPI here ? We won't be building this if it
> is not enabled.
>

I think we can remove the CONFIG_ACPI condition here and build the file by default.

> If this is for some module building, then does it make sense to have
> more specific config ? May be CONFIG_ACPI_THERMAL ?
>
>> diff --git a/drivers/acpi/arm64/thermal_cpufreq.c b/drivers/acpi/arm64/thermal_cpufreq.c
>> new file mode 100644
>> index 000000000000..de834fb013e7
>> --- /dev/null
>> +++ b/drivers/acpi/arm64/thermal_cpufreq.c
>> @@ -0,0 +1,20 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +#include <linux/acpi.h>
>> +
>> +#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
>> +#define SMCCC_SOC_ID_T241 0x036b0241
>> +
>> +int acpi_thermal_cpufreq_pctg(void)
>> +{
>> + s32 soc_id = arm_smccc_get_soc_id_version();
>> +
>> + /*
>> + * Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) and
>> + * reduce the CPUFREQ Thermal reduction percentage to 5%.
>> + */
>> + if (soc_id == SMCCC_SOC_ID_T241)
>> + return 5;
>> +
>> + return 0;
>> +}
>> +#endif
>
> Since this looks like arch specific hook/callback, not sure if it is good
> idea to have "arch_" in the function name. But if Rafael is OK with the name
> I am fine with this as well.
>
> --
> Regards,
> Sudeep

I will change the prefix from acpi_thermal_cpufreq_* to
acpi_arch_thermal_cpufreq_* if that is more suitable.
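
The hook-plus-fallback arrangement discussed in this thread can be sketched in userspace as follows. The SoC ID is passed in as a parameter here so the example can run anywhere; in the patch it comes from arm_smccc_get_soc_id_version(). The struct and the config function are hypothetical stand-ins for the driver's static variables, and the acpi_arch_ prefix follows the rename proposed above.

```c
#include <assert.h>
#include <stdint.h>

#define SMCCC_SOC_ID_T241 0x036b0241

/* Returns the per-step reduction percentage for known SoCs, 0 otherwise. */
static int acpi_arch_thermal_cpufreq_pctg(int32_t soc_id)
{
	return soc_id == SMCCC_SOC_ID_T241 ? 5 : 0;
}

/* Hypothetical stand-in for the driver's tunables. */
struct thermal_tunables {
	int pctg;      /* reduction per cooling state, in percent */
	int max_step;  /* highest cooling state */
};

static void thermal_cpufreq_config(struct thermal_tunables *t, int32_t soc_id)
{
	int pctg = acpi_arch_thermal_cpufreq_pctg(soc_id);

	assert(t); /* caller must pass a valid pointer */
	if (!pctg)
		return; /* unknown SoC: keep the defaults untouched */

	t->pctg = pctg;
	t->max_step = (100 / pctg) - 1; /* v5 formula */
}
```

Starting from the defaults (20, 3), an unrecognised SoC ID leaves both values unchanged, while the Tegra241 ID switches them to 5 and 19 under the v5 formula.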

Best Regards,
Sumit Gupta