2022-02-14 21:10:59

by Alyssa Rosenzweig

[permalink] [raw]
Subject: [PATCH] drm/panfrost: Dynamically allocate pm_domains

MT8192 requires 5 power domains. Rather than bump MAX_PM_DOMAINS and
waste memory on every supported Panfrost chip, instead dynamically
allocate pm_domain_devs and pm_domain_links. This adds some flexibility;
it seems inevitable a new MediaTek device will require more than 5
domains.

On non-MediaTek devices, this saves a small amount of memory.

Suggested-by: AngeloGioacchino Del Regno <[email protected]>
Signed-off-by: Alyssa Rosenzweig <[email protected]>
---
drivers/gpu/drm/panfrost/panfrost_device.c | 14 ++++++++++----
drivers/gpu/drm/panfrost/panfrost_device.h | 5 ++---
2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
index ee612303f076..661cdec320af 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.c
+++ b/drivers/gpu/drm/panfrost/panfrost_device.c
@@ -127,7 +127,10 @@ static void panfrost_pm_domain_fini(struct panfrost_device *pfdev)
{
int i;

- for (i = 0; i < ARRAY_SIZE(pfdev->pm_domain_devs); i++) {
+ if (!pfdev->pm_domain_devs || !pfdev->pm_domain_links)
+ return;
+
+ for (i = 0; i < pfdev->comp->num_pm_domains; i++) {
if (!pfdev->pm_domain_devs[i])
break;

@@ -161,9 +164,12 @@ static int panfrost_pm_domain_init(struct panfrost_device *pfdev)
return -EINVAL;
}

- if (WARN(num_domains > ARRAY_SIZE(pfdev->pm_domain_devs),
- "Too many supplies in compatible structure.\n"))
- return -EINVAL;
+ pfdev->pm_domain_devs = devm_kcalloc(pfdev->dev, num_domains,
+ sizeof(*pfdev->pm_domain_devs),
+ GFP_KERNEL);
+ pfdev->pm_domain_links = devm_kcalloc(pfdev->dev, num_domains,
+ sizeof(*pfdev->pm_domain_links),
+ GFP_KERNEL);

for (i = 0; i < num_domains; i++) {
pfdev->pm_domain_devs[i] =
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
index 8b25278f34c8..98e3039696f9 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -22,7 +22,6 @@ struct panfrost_job;
struct panfrost_perfcnt;

#define NUM_JOB_SLOTS 3
-#define MAX_PM_DOMAINS 3

struct panfrost_features {
u16 id;
@@ -87,8 +86,8 @@ struct panfrost_device {
struct regulator_bulk_data *regulators;
struct reset_control *rstc;
/* pm_domains for devices with more than one. */
- struct device *pm_domain_devs[MAX_PM_DOMAINS];
- struct device_link *pm_domain_links[MAX_PM_DOMAINS];
+ struct device **pm_domain_devs;
+ struct device_link **pm_domain_links;
bool coherent;

struct panfrost_features features;
--
2.34.1


2022-02-14 21:35:48

by Alyssa Rosenzweig

[permalink] [raw]
Subject: Re: [PATCH] drm/panfrost: Dynamically allocate pm_domains

mali_kbase hardcodes MAX_PM_DOMAINS (=5 for the mt8192 kernel). I have
no real objection to it but Angelo did. Maybe should've marked this RFC.

On Mon, Feb 14, 2022 at 03:31:32PM -0500, Alyssa Rosenzweig wrote:
> MT8192 requires 5 power domains. Rather than bump MAX_PM_DOMAINS and
> waste memory on every supported Panfrost chip, instead dynamically
> allocate pm_domain_devs and pm_domain_links. This adds some flexibility;
> it seems inevitable a new MediaTek device will require more than 5
> domains.
>
> On non-MediaTek devices, this saves a small amount of memory.
>
> Suggested-by: AngeloGioacchino Del Regno <[email protected]>
> Signed-off-by: Alyssa Rosenzweig <[email protected]>
> ---
> drivers/gpu/drm/panfrost/panfrost_device.c | 14 ++++++++++----
> drivers/gpu/drm/panfrost/panfrost_device.h | 5 ++---
> 2 files changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
> index ee612303f076..661cdec320af 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
> @@ -127,7 +127,10 @@ static void panfrost_pm_domain_fini(struct panfrost_device *pfdev)
> {
> int i;
>
> - for (i = 0; i < ARRAY_SIZE(pfdev->pm_domain_devs); i++) {
> + if (!pfdev->pm_domain_devs || !pfdev->pm_domain_links)
> + return;
> +
> + for (i = 0; i < pfdev->comp->num_pm_domains; i++) {
> if (!pfdev->pm_domain_devs[i])
> break;
>
> @@ -161,9 +164,12 @@ static int panfrost_pm_domain_init(struct panfrost_device *pfdev)
> return -EINVAL;
> }
>
> - if (WARN(num_domains > ARRAY_SIZE(pfdev->pm_domain_devs),
> - "Too many supplies in compatible structure.\n"))
> - return -EINVAL;
> + pfdev->pm_domain_devs = devm_kcalloc(pfdev->dev, num_domains,
> + sizeof(*pfdev->pm_domain_devs),
> + GFP_KERNEL);
> + pfdev->pm_domain_links = devm_kcalloc(pfdev->dev, num_domains,
> + sizeof(*pfdev->pm_domain_links),
> + GFP_KERNEL);
>
> for (i = 0; i < num_domains; i++) {
> pfdev->pm_domain_devs[i] =
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
> index 8b25278f34c8..98e3039696f9 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -22,7 +22,6 @@ struct panfrost_job;
> struct panfrost_perfcnt;
>
> #define NUM_JOB_SLOTS 3
> -#define MAX_PM_DOMAINS 3
>
> struct panfrost_features {
> u16 id;
> @@ -87,8 +86,8 @@ struct panfrost_device {
> struct regulator_bulk_data *regulators;
> struct reset_control *rstc;
> /* pm_domains for devices with more than one. */
> - struct device *pm_domain_devs[MAX_PM_DOMAINS];
> - struct device_link *pm_domain_links[MAX_PM_DOMAINS];
> + struct device **pm_domain_devs;
> + struct device_link **pm_domain_links;
> bool coherent;
>
> struct panfrost_features features;
> --
> 2.34.1
>

Subject: Re: [PATCH] drm/panfrost: Dynamically allocate pm_domains

Il 14/02/22 21:55, Alyssa Rosenzweig ha scritto:
> mali_kbase hardcodes MAX_PM_DOMAINS (=5 for the mt8192 kernel). I have
> no real objection to it but Angelo did. Maybe should've marked this RFC.

Clarifying, the suggested patch was not a big objection, but I think that it
would be a nice preventive cleanup that is useful for the power tree that has
to be managed on MT8192 and possibly on other SoCs.

I would expect to see a variable amount of PM domains to take care of as time
goes by (with new SoCs, not necessarily only MediaTek) due to granular power
optimizations but, at the same time, dynamically allocating the pm_domain_devs
and links structures makes this driver to also be nice with older platforms,
where memory is a little more constrained, allowing Linux to have a lighter
footprint, even if not by much.

Logic for this footprint saving is "a little here, a little there, becomes a
bit more considerable" (of course, being aware of both upsides and downsides
in dynamically allocating things, and avoiding to write gigabytes of text to
explain common knowledge).

P.S.: Thank you all!

Regards,
Angelo

>
> On Mon, Feb 14, 2022 at 03:31:32PM -0500, Alyssa Rosenzweig wrote:
>> MT8192 requires 5 power domains. Rather than bump MAX_PM_DOMAINS and
>> waste memory on every supported Panfrost chip, instead dynamically
>> allocate pm_domain_devs and pm_domain_links. This adds some flexibility;
>> it seems inevitable a new MediaTek device will require more than 5
>> domains.
>>
>> On non-MediaTek devices, this saves a small amount of memory.
>>
>> Suggested-by: AngeloGioacchino Del Regno <[email protected]>
>> Signed-off-by: Alyssa Rosenzweig <[email protected]>
>> ---
>> drivers/gpu/drm/panfrost/panfrost_device.c | 14 ++++++++++----
>> drivers/gpu/drm/panfrost/panfrost_device.h | 5 ++---
>> 2 files changed, 12 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
>> index ee612303f076..661cdec320af 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
>> @@ -127,7 +127,10 @@ static void panfrost_pm_domain_fini(struct panfrost_device *pfdev)
>> {
>> int i;
>>
>> - for (i = 0; i < ARRAY_SIZE(pfdev->pm_domain_devs); i++) {
>> + if (!pfdev->pm_domain_devs || !pfdev->pm_domain_links)
>> + return;
>> +
>> + for (i = 0; i < pfdev->comp->num_pm_domains; i++) {
>> if (!pfdev->pm_domain_devs[i])
>> break;
>>
>> @@ -161,9 +164,12 @@ static int panfrost_pm_domain_init(struct panfrost_device *pfdev)
>> return -EINVAL;
>> }
>>
>> - if (WARN(num_domains > ARRAY_SIZE(pfdev->pm_domain_devs),
>> - "Too many supplies in compatible structure.\n"))
>> - return -EINVAL;
>> + pfdev->pm_domain_devs = devm_kcalloc(pfdev->dev, num_domains,
>> + sizeof(*pfdev->pm_domain_devs),
>> + GFP_KERNEL);
>> + pfdev->pm_domain_links = devm_kcalloc(pfdev->dev, num_domains,
>> + sizeof(*pfdev->pm_domain_links),
>> + GFP_KERNEL);
>>
>> for (i = 0; i < num_domains; i++) {
>> pfdev->pm_domain_devs[i] =
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
>> index 8b25278f34c8..98e3039696f9 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
>> @@ -22,7 +22,6 @@ struct panfrost_job;
>> struct panfrost_perfcnt;
>>
>> #define NUM_JOB_SLOTS 3
>> -#define MAX_PM_DOMAINS 3
>>
>> struct panfrost_features {
>> u16 id;
>> @@ -87,8 +86,8 @@ struct panfrost_device {
>> struct regulator_bulk_data *regulators;
>> struct reset_control *rstc;
>> /* pm_domains for devices with more than one. */
>> - struct device *pm_domain_devs[MAX_PM_DOMAINS];
>> - struct device_link *pm_domain_links[MAX_PM_DOMAINS];
>> + struct device **pm_domain_devs;
>> + struct device_link **pm_domain_links;
>> bool coherent;
>>
>> struct panfrost_features features;
>> --
>> 2.34.1
>>

2022-02-15 13:12:12

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH] drm/panfrost: Dynamically allocate pm_domains

On 2022-02-14 20:31, Alyssa Rosenzweig wrote:
> MT8192 requires 5 power domains. Rather than bump MAX_PM_DOMAINS and
> waste memory on every supported Panfrost chip, instead dynamically
> allocate pm_domain_devs and pm_domain_links. This adds some flexibility;
> it seems inevitable a new MediaTek device will require more than 5
> domains.
>
> On non-MediaTek devices, this saves a small amount of memory.
>
> Suggested-by: AngeloGioacchino Del Regno <[email protected]>
> Signed-off-by: Alyssa Rosenzweig <[email protected]>
> ---
> drivers/gpu/drm/panfrost/panfrost_device.c | 14 ++++++++++----
> drivers/gpu/drm/panfrost/panfrost_device.h | 5 ++---
> 2 files changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
> index ee612303f076..661cdec320af 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
> @@ -127,7 +127,10 @@ static void panfrost_pm_domain_fini(struct panfrost_device *pfdev)
> {
> int i;
>
> - for (i = 0; i < ARRAY_SIZE(pfdev->pm_domain_devs); i++) {
> + if (!pfdev->pm_domain_devs || !pfdev->pm_domain_links)
> + return;
> +
> + for (i = 0; i < pfdev->comp->num_pm_domains; i++) {
> if (!pfdev->pm_domain_devs[i])
> break;
>
> @@ -161,9 +164,12 @@ static int panfrost_pm_domain_init(struct panfrost_device *pfdev)
> return -EINVAL;
> }
>
> - if (WARN(num_domains > ARRAY_SIZE(pfdev->pm_domain_devs),
> - "Too many supplies in compatible structure.\n"))
> - return -EINVAL;
> + pfdev->pm_domain_devs = devm_kcalloc(pfdev->dev, num_domains,
> + sizeof(*pfdev->pm_domain_devs),
> + GFP_KERNEL);
> + pfdev->pm_domain_links = devm_kcalloc(pfdev->dev, num_domains,
> + sizeof(*pfdev->pm_domain_links),
> + GFP_KERNEL);

Since we're not really doing any detailed management of our device
links, could we get away with using AUTOREMOVE_CONSUMER instead of
STATELESS to avoid having to explicitly keep track of them ourselves?

Robin.

>
> for (i = 0; i < num_domains; i++) {
> pfdev->pm_domain_devs[i] =
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
> index 8b25278f34c8..98e3039696f9 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -22,7 +22,6 @@ struct panfrost_job;
> struct panfrost_perfcnt;
>
> #define NUM_JOB_SLOTS 3
> -#define MAX_PM_DOMAINS 3
>
> struct panfrost_features {
> u16 id;
> @@ -87,8 +86,8 @@ struct panfrost_device {
> struct regulator_bulk_data *regulators;
> struct reset_control *rstc;
> /* pm_domains for devices with more than one. */
> - struct device *pm_domain_devs[MAX_PM_DOMAINS];
> - struct device_link *pm_domain_links[MAX_PM_DOMAINS];
> + struct device **pm_domain_devs;
> + struct device_link **pm_domain_links;
> bool coherent;
>
> struct panfrost_features features;

2022-02-16 06:45:57

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH] drm/panfrost: Dynamically allocate pm_domains

On Mon, Feb 14, 2022 at 2:31 PM Alyssa Rosenzweig
<[email protected]> wrote:
>
> MT8192 requires 5 power domains. Rather than bump MAX_PM_DOMAINS and
> waste memory on every supported Panfrost chip, instead dynamically
> allocate pm_domain_devs and pm_domain_links. This adds some flexibility;
> it seems inevitable a new MediaTek device will require more than 5
> domains.
>
> On non-MediaTek devices, this saves a small amount of memory.

How much? You measured it?

It's not that simple. kmalloc has finite allocation sizes (see
/proc/slabinfo). So unless panfrost_device shrinks or grows to the
next smaller or larger size, the memory used doesn't change. And each
devm_kmalloc adds its own overhead as well.

I'd do the oneliner changing it to 5 and be done with it. That being
said, we have plenty of examples of doing this both ways, so whatever
makes people happy.

Rob

2022-02-16 07:33:14

by Alyssa Rosenzweig

[permalink] [raw]
Subject: Re: [PATCH] drm/panfrost: Dynamically allocate pm_domains

> I'd do the oneliner changing it to 5 and be done with it. That being
> said, we have plenty of examples of doing this both ways, so whatever
> makes people happy.

Excellent, that's the patch I wrote originally :-)

Dropping this patch, unless Angelo (or someone else) strongly objects.

Subject: Re: [PATCH] drm/panfrost: Dynamically allocate pm_domains

Il 15/02/22 23:32, Alyssa Rosenzweig ha scritto:
>> I'd do the oneliner changing it to 5 and be done with it. That being
>> said, we have plenty of examples of doing this both ways, so whatever
>> makes people happy.
>
> Excellent, that's the patch I wrote originally :-)
>
> Dropping this patch, unless Angelo (or someone else) strongly objects.


Concretely measuring would be great, but I have no strong objections about
perhaps delaying this research to another (near or far) moment.
Let's drop this for now.