2023-09-13 13:18:27

by Shrikanth Hegde

[permalink] [raw]
Subject: Re: [PATCH v2] sched/topology: remove sysctl_sched_energy_aware depending on the architecture



On 9/7/23 4:04 PM, Pierre Gondois wrote:
> Hello Shrikanth,
>
> On 9/6/23 13:35, Shrikanth Hegde wrote:
>>
>>
>> On 9/5/23 7:33 PM, Pierre Gondois wrote:
>>> Hello Shrikanth,
>>> I tried the patch (on a platform using the cppc_cpufreq driver). The
>>> platform
>>> normally has EAS enabled, but the patch removed the sched_energy_aware
>>> sysctl.
>>> It seemed the following happened (in the below order):
>>>
>>> 1. sched_energy_aware_sysctl_init()
>>> Doesn't set sysctl_sched_energy_aware as cpufreq_freq_invariance
>>> isn't set
>>> and arch_scale_freq_invariant() returns false
>>>
>>> 2. cpufreq_register_driver()
>>> Sets cpufreq_freq_invariance during cpufreq initialization
>>> sched_energy_set()
>>>
>>> 3. sched_energy_set()
>>> Is called with has_eas=0 since build_perf_domains() doesn't see the
>>> platform
>>> as EAS compatible. Indeed sysctl_sched_energy_aware=0.
>>> So with sysctl_sched_energy_aware=0 and has_eas=0, sched_energy_aware
>>> sysctl
>>> is not enabled even though EAS should be possible.
>>>
>>>
>>> On 9/1/23 08:52, Shrikanth Hegde wrote:
>>>> Currently sysctl_sched_energy_aware doesn't alter the said behaviour on
>>>> some of the architectures. IIUC its meant to either force rebuild the
>>>> perf domains or cleanup the perf domains by echoing 1 or 0
>>>> respectively.
>>>
>>> There is a definition of the sysctl at:
>>> Documentation/admin-guide/sysctl/kernel.rst::sched_energy_aware
>> [...]
>>>>
>>>>
>>>> +static unsigned int sysctl_sched_energy_aware;
>>>> +static struct ctl_table_header *sysctl_eas_header;
>>>
>>> The variables around the presence/absence of EAS are:
>>> - sched_energy_present:
>>> EAS is up and running
>>>
>>> - sysctl_sched_energy_aware:
>>> The user wants to use EAS (or not). Doesn't mean EAS can run on the
>>> platform.
>>>
>>> - sched_energy_set/partition_sched_domains_locked's "has_eas":
>>> Local variable. Represent whether EAS can run on the platform.
>>>
>>> IMO it would be simpler to (un)register sched_energy_aware sysctl
>>> in partition_sched_domains_locked(), based on the value of "has_eas".
>>> This would allow to let all the logic as it is right now, inside
>>> build_perf_domains(), and then advertise sched_energy_aware sysctl
>>> if EAS can run on the platform.
>>> sched_energy_aware_sysctl_init() would be deleted then.
>>>
>>>
>> yes. that is true. and there is no variable which holds the info if
>> the system
>> is capable of EAS.
>>
>> Retrospecting, the reason for starting this patch series was this,
>> sysctl_sched_energy_aware didnt make sense on power10 platform since it
>> has SMT and symmetric CPU capacities.  with current code writing 1 to
>> it cause rebuild of sched domains but EAS wouldn't be possible.
>>
>>
>> Possible Approaches:
>>
>> 1.
>> Make this sysctl write as NOP if the platform doesn't has EAS
>> capabilities at
>> the moment.  Do those checks in sched_energy_aware_handler before
>> handling  the
>> change in value. Return EINVAL.  And Update sysctl description that on
>> such
>> platforms value change is NOP. Relatively simpler change.
>>
>> 2.
>> Current patch approach, remove the sysctl completely on non supported
>> architectures and re-enable it if the system becomes capable of doing
>> EAS.
>> With the current patch, instead of using sched_energy_update, use another
>> variable called sched_energy_change_in_sysctl(maybe different name). 
>> I think
>> that would handle all the cases. Another variable can be avoided by
>> encoding
>> the info in sysctl_sched_energy_aware itself in the handler call,
>> since it
>> takes only 1 or 0 as the value. upper bits are free to use. update the
>> sysctl
>> as well with this behavior. plus minor cleanup to remove the init of
>> sysctl.
>>
>
> FWIW I think it makes more sense to remove the sysctl if EAS isn't
> possible on
> the platform, as this patch intends to do.
> From a code perspective I m not sure I follow exactly your intend. I can
> test
> your v3 if necessary,
>

Hi Pierre,

Just sent out v3 of the series.
https://lore.kernel.org/lkml/[email protected]/

It would really help if you can test this out. I don't have system where EAS can be
enabled at the moment.

> Regards,
> Pierre
>