From: Jordan Crouse <[email protected]>
Try to get the interconnect path for the GPU and vote for the maximum
bandwidth to support all frequencies. This is needed for performance.
Later we will want to scale the bandwidth based on the frequency to
also optimize for power but that will require some device tree
infrastructure that does not yet exist.
v6: use icc_set_bw() instead of icc_set()
v5: Remove hardcoded interconnect name and just use the default
v4: Don't use a port string at all to skip the need for names in the DT
v3: Use macros and change port string per Georgi Djakov
Signed-off-by: Jordan Crouse <[email protected]>
Acked-by: Rob Clark <[email protected]>
Reviewed-by: Evan Green <[email protected]>
Signed-off-by: Georgi Djakov <[email protected]>
---
Hi Greg,
If not too late, could you please take this patch into char-misc-next.
It is adding the first consumer of the interconnect API. We are just
getting the code in place, without making it functional yet, as some
DT bits are still needed to actually enable it. We have Rob's Ack to
merge this together with the interconnect code. This patch has already
spent some time in linux-next without any issues.
Thanks,
Georgi
drivers/gpu/drm/msm/Kconfig | 1 +
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 20 ++++++++++++++++++++
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 9 +++++++++
drivers/gpu/drm/msm/msm_gpu.h | 3 +++
4 files changed, 33 insertions(+)
diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index cf549f1ed403..78c9e5a5e793 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -5,6 +5,7 @@ config DRM_MSM
depends on ARCH_QCOM || SOC_IMX5 || (ARM && COMPILE_TEST)
depends on OF && COMMON_CLK
depends on MMU
+ depends on INTERCONNECT || !INTERCONNECT
select QCOM_MDT_LOADER if ARCH_QCOM
select REGULATOR
select DRM_KMS_HELPER
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index ce1b3cc4bf6d..d1662a75c7ec 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -2,6 +2,7 @@
/* Copyright (c) 2017-2018 The Linux Foundation. All rights reserved. */
#include <linux/clk.h>
+#include <linux/interconnect.h>
#include <linux/pm_opp.h>
#include <soc/qcom/cmd-db.h>
@@ -84,6 +85,9 @@ bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu)
static void __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index)
{
+ struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
+ struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
+ struct msm_gpu *gpu = &adreno_gpu->base;
int ret;
gmu_write(gmu, REG_A6XX_GMU_DCVS_ACK_OPTION, 0);
@@ -106,6 +110,12 @@ static void __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index)
dev_err(gmu->dev, "GMU set GPU frequency error: %d\n", ret);
gmu->freq = gmu->gpu_freqs[index];
+
+ /*
+ * Eventually we will want to scale the path vote with the frequency but
+ * for now leave it at max so that the performance is nominal.
+ */
+ icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216));
}
void a6xx_gmu_set_freq(struct msm_gpu *gpu, unsigned long freq)
@@ -705,6 +715,8 @@ int a6xx_gmu_reset(struct a6xx_gpu *a6xx_gpu)
int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
{
+ struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
+ struct msm_gpu *gpu = &adreno_gpu->base;
struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
int status, ret;
@@ -720,6 +732,9 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
if (ret)
goto out;
+ /* Set the bus quota to a reasonable value for boot */
+ icc_set_bw(gpu->icc_path, 0, MBps_to_icc(3072));
+
a6xx_gmu_irq_enable(gmu);
/* Check to see if we are doing a cold or warm boot */
@@ -760,6 +775,8 @@ bool a6xx_gmu_isidle(struct a6xx_gmu *gmu)
int a6xx_gmu_stop(struct a6xx_gpu *a6xx_gpu)
{
+ struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
+ struct msm_gpu *gpu = &adreno_gpu->base;
struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
u32 val;
@@ -806,6 +823,9 @@ int a6xx_gmu_stop(struct a6xx_gpu *a6xx_gpu)
/* Tell RPMh to power off the GPU */
a6xx_rpmh_stop(gmu);
+ /* Remove the bus vote */
+ icc_set_bw(gpu->icc_path, 0, 0);
+
clk_bulk_disable_unprepare(gmu->nr_clocks, gmu->clocks);
pm_runtime_put_sync(gmu->dev);
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 2cfee1a4fe0b..27898475cdf4 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -18,6 +18,7 @@
*/
#include <linux/ascii85.h>
+#include <linux/interconnect.h>
#include <linux/kernel.h>
#include <linux/pm_opp.h>
#include <linux/slab.h>
@@ -747,6 +748,11 @@ static int adreno_get_pwrlevels(struct device *dev,
DBG("fast_rate=%u, slow_rate=27000000", gpu->fast_rate);
+ /* Check for an interconnect path for the bus */
+ gpu->icc_path = of_icc_get(dev, NULL);
+ if (IS_ERR(gpu->icc_path))
+ gpu->icc_path = NULL;
+
return 0;
}
@@ -787,10 +793,13 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
void adreno_gpu_cleanup(struct adreno_gpu *adreno_gpu)
{
+ struct msm_gpu *gpu = &adreno_gpu->base;
unsigned int i;
for (i = 0; i < ARRAY_SIZE(adreno_gpu->info->fw); i++)
release_firmware(adreno_gpu->fw[i]);
+ icc_put(gpu->icc_path);
+
msm_gpu_cleanup(&adreno_gpu->base);
}
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index ca17086f72c9..6241986bab51 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -19,6 +19,7 @@
#define __MSM_GPU_H__
#include <linux/clk.h>
+#include <linux/interconnect.h>
#include <linux/regulator/consumer.h>
#include "msm_drv.h"
@@ -118,6 +119,8 @@ struct msm_gpu {
struct clk *ebi1_clk, *core_clk, *rbbmtimer_clk;
uint32_t fast_rate;
+ struct icc_path *icc_path;
+
/* Hang and Inactivity Detection:
*/
#define DRM_MSM_INACTIVE_PERIOD 66 /* in ms (roughly four frames) */
On Tue, Feb 12, 2019 at 11:52:38AM +0200, Georgi Djakov wrote:
> From: Jordan Crouse <[email protected]>
>
> Try to get the interconnect path for the GPU and vote for the maximum
> bandwidth to support all frequencies. This is needed for performance.
> Later we will want to scale the bandwidth based on the frequency to
> also optimize for power but that will require some device tree
> infrastructure that does not yet exist.
>
> v6: use icc_set_bw() instead of icc_set()
> v5: Remove hardcoded interconnect name and just use the default
> v4: Don't use a port string at all to skip the need for names in the DT
> v3: Use macros and change port string per Georgi Djakov
>
> Signed-off-by: Jordan Crouse <[email protected]>
> Acked-by: Rob Clark <[email protected]>
> Reviewed-by: Evan Green <[email protected]>
> Signed-off-by: Georgi Djakov <[email protected]>
> ---
>
> Hi Greg,
>
> If not too late, could you please take this patch into char-misc-next.
> It is adding the first consumer of the interconnect API. We are just
> getting the code in place, without making it functional yet, as some
> DT bits are still needed to actually enable it. We have Rob's Ack to
> merge this together with the interconnect code. This patch has already
> spent some time in linux-next without any issues.
I have a question about the interconnect code. Last week I saw a
presentation about the resctrl/RDT code from ARM that is coming (MPAM),
and it really looks like the same functionality as this interconnect
code. In fact, this code looks like the existing resctrl stuff, right?
So why shouldn't we just drop the interconnect code and use resctrl
instead as it's already merged?
thanks,
greg k-h
Hi Greg,
On 2/12/19 12:16, Greg KH wrote:
> On Tue, Feb 12, 2019 at 11:52:38AM +0200, Georgi Djakov wrote:
>> From: Jordan Crouse <[email protected]>
>>
>> Try to get the interconnect path for the GPU and vote for the maximum
>> bandwidth to support all frequencies. This is needed for performance.
>> Later we will want to scale the bandwidth based on the frequency to
>> also optimize for power but that will require some device tree
>> infrastructure that does not yet exist.
>>
>> v6: use icc_set_bw() instead of icc_set()
>> v5: Remove hardcoded interconnect name and just use the default
>> v4: Don't use a port string at all to skip the need for names in the DT
>> v3: Use macros and change port string per Georgi Djakov
>>
>> Signed-off-by: Jordan Crouse <[email protected]>
>> Acked-by: Rob Clark <[email protected]>
>> Reviewed-by: Evan Green <[email protected]>
>> Signed-off-by: Georgi Djakov <[email protected]>
>> ---
>>
>> Hi Greg,
>>
>> If not too late, could you please take this patch into char-misc-next.
>> It is adding the first consumer of the interconnect API. We are just
>> getting the code in place, without making it functional yet, as some
>> DT bits are still needed to actually enable it. We have Rob's Ack to
>> merge this together with the interconnect code. This patch has already
>> spent some time in linux-next without any issues.
>
> I have a question about the interconnect code. Last week I saw a
> presentation about the resctrl/RDT code from ARM that is coming (MPAM),
> and it really looks like the same functionality as this interconnect
> code. In fact, this code looks like the existing resctrl stuff, right?
Thanks for the question! It's nice that MPAM is moving forward. When I
looked into the MPAM draft spec a year ago, it was an optional
extension, mostly mentioning use-cases with VMs on server systems.
But anyway, MPAM is only available for ARMv8.2+ cores as an optional
extension, and aarch32 is not supported. In contrast to that, the
interconnect code is generic and does not put any limitations on the
platform/architecture that can use it - only the platform-specific
implementation would be different. We have discussed in the past that
it can be used even on x86 platforms to provide hints to firmware.
> So why shouldn't we just drop the interconnect code and use resctrl
> instead as it's already merged?
I haven't seen any MPAM code so far, but I assume that we can have an
interconnect provider that implements this MPAM extension for systems
that support it (and want to use it). Currently there are people working
on various interconnect platform drivers from 5 different SoC vendors,
and we have agreed to use common DT bindings (and a common API). I doubt
that even a single one of these platforms is based on v8.2+. Such SoCs
will probably come in the future, and then I expect people to make use
of MPAM in an interconnect provider driver.
Thanks,
Georgi
On Tue, Feb 12, 2019 at 04:07:35PM +0200, Georgi Djakov wrote:
> Hi Greg,
>
> On 2/12/19 12:16, Greg KH wrote:
> > On Tue, Feb 12, 2019 at 11:52:38AM +0200, Georgi Djakov wrote:
> >> From: Jordan Crouse <[email protected]>
> >>
> >> Try to get the interconnect path for the GPU and vote for the maximum
> >> bandwidth to support all frequencies. This is needed for performance.
> >> Later we will want to scale the bandwidth based on the frequency to
> >> also optimize for power but that will require some device tree
> >> infrastructure that does not yet exist.
> >>
> >> v6: use icc_set_bw() instead of icc_set()
> >> v5: Remove hardcoded interconnect name and just use the default
> >> v4: Don't use a port string at all to skip the need for names in the DT
> >> v3: Use macros and change port string per Georgi Djakov
> >>
> >> Signed-off-by: Jordan Crouse <[email protected]>
> >> Acked-by: Rob Clark <[email protected]>
> >> Reviewed-by: Evan Green <[email protected]>
> >> Signed-off-by: Georgi Djakov <[email protected]>
> >> ---
> >>
> >> Hi Greg,
> >>
> >> If not too late, could you please take this patch into char-misc-next.
> >> It is adding the first consumer of the interconnect API. We are just
> >> getting the code in place, without making it functional yet, as some
> >> DT bits are still needed to actually enable it. We have Rob's Ack to
> >> merge this together with the interconnect code. This patch has already
> >> spent some time in linux-next without any issues.
> >
> > I have a question about the interconnect code. Last week I saw a
> > presentation about the resctrl/RDT code from ARM that is coming (MPAM),
> > and it really looks like the same functionality as this interconnect
> > code. In fact, this code looks like the existing resctrl stuff, right?
>
> Thanks for the question! It's nice that MPAM is moving forward. When I
> looked into the MPAM draft spec a year ago, it was an optional
> extension, mostly mentioning use-cases with VMs on server systems.
>
> But anyway, MPAM is only available for ARMv8.2+ cores as an optional
> extension, and aarch32 is not supported. In contrast to that, the
> interconnect code is generic and does not put any limitations on the
> platform/architecture that can use it - only the platform-specific
> implementation would be different. We have discussed in the past that
> it can be used even on x86 platforms to provide hints to firmware.
Yes, but resctrl is arch independent. It's not the "backend" that I'm
concerned about, it's the userspace and in-kernel API that I worry
about.
> > So why shouldn't we just drop the interconnect code and use resctrl
> > instead as it's already merged?
>
> I haven't seen any MPAM code so far, but I assume that we can have an
> interconnect provider that implements this MPAM extension for systems
> that support it (and want to use it). Currently there are people working
> on various interconnect platform drivers from 5 different SoC vendors,
> and we have agreed to use common DT bindings (and a common API). I doubt
> that even a single one of these platforms is based on v8.2+. Such SoCs
> will probably come in the future, and then I expect people to make use
> of MPAM in an interconnect provider driver.
Again, don't focus on MPAM as-is. It's the resctrl API that I'm asking
about - I would like to see an explanation of why interconnect can't use it.
thanks,
greg k-h
Hi,
On 2/12/19 16:35, Greg KH wrote:
> On Tue, Feb 12, 2019 at 04:07:35PM +0200, Georgi Djakov wrote:
>> Hi Greg,
>>
>> On 2/12/19 12:16, Greg KH wrote:
>>> On Tue, Feb 12, 2019 at 11:52:38AM +0200, Georgi Djakov wrote:
>>>> From: Jordan Crouse <[email protected]>
>>>>
>>>> Try to get the interconnect path for the GPU and vote for the maximum
>>>> bandwidth to support all frequencies. This is needed for performance.
>>>> Later we will want to scale the bandwidth based on the frequency to
>>>> also optimize for power but that will require some device tree
>>>> infrastructure that does not yet exist.
>>>>
>>>> v6: use icc_set_bw() instead of icc_set()
>>>> v5: Remove hardcoded interconnect name and just use the default
>>>> v4: Don't use a port string at all to skip the need for names in the DT
>>>> v3: Use macros and change port string per Georgi Djakov
>>>>
>>>> Signed-off-by: Jordan Crouse <[email protected]>
>>>> Acked-by: Rob Clark <[email protected]>
>>>> Reviewed-by: Evan Green <[email protected]>
>>>> Signed-off-by: Georgi Djakov <[email protected]>
>>>> ---
>>>>
>>>> Hi Greg,
>>>>
>>>> If not too late, could you please take this patch into char-misc-next.
>>>> It is adding the first consumer of the interconnect API. We are just
>>>> getting the code in place, without making it functional yet, as some
>>>> DT bits are still needed to actually enable it. We have Rob's Ack to
>>>> merge this together with the interconnect code. This patch has already
>>>> spent some time in linux-next without any issues.
>>>
>>> I have a question about the interconnect code. Last week I saw a
>>> presentation about the resctrl/RDT code from ARM that is coming (MPAM),
>>> and it really looks like the same functionality as this interconnect
>>> code. In fact, this code looks like the existing resctrl stuff, right?
>>
>> Thanks for the question! It's nice that MPAM is moving forward. When I
>> looked into the MPAM draft spec a year ago, it was an optional
>> extension, mostly mentioning use-cases with VMs on server systems.
>>
>> But anyway, MPAM is only available for ARMv8.2+ cores as an optional
>> extension, and aarch32 is not supported. In contrast to that, the
>> interconnect code is generic and does not put any limitations on the
>> platform/architecture that can use it - only the platform-specific
>> implementation would be different. We have discussed in the past that
>> it can be used even on x86 platforms to provide hints to firmware.
>
> Yes, but resctrl is arch independent. It's not the "backend" that I'm
> concerned about, it's the userspace and in-kernel API that I worry
> about.
Agreed that resctrl is now arch independent, but it looks to me like
resctrl serves a different purpose. The two may sound similar because
both relate to bandwidth management, but they are completely different,
and resctrl does not seem suitable for managing the interconnects of a
system-on-chip. If I understand correctly, resctrl is about monitoring
and controlling system resources like cache (L2, L3) and memory
bandwidth, as used by applications, VMs and containers, in a CPU-centric
approach. It does this by making use of some CPU hardware features and
exposing them via a filesystem to be controlled from userspace.
>>> So why shouldn't we just drop the interconnect code and use resctrl
>>> instead as it's already merged?
>>
>> I haven't seen any MPAM code so far, but I assume that we can have an
>> interconnect provider that implements this MPAM extension for systems
>> that support it (and want to use it). Currently there are people working
>> on various interconnect platform drivers from 5 different SoC vendors,
>> and we have agreed to use common DT bindings (and a common API). I doubt
>> that even a single one of these platforms is based on v8.2+. Such SoCs
>> will probably come in the future, and then I expect people to make use
>> of MPAM in an interconnect provider driver.
>
> Again, don't focus on MPAM as-is. It's the resctrl API that I'm asking
> about - I would like to see an explanation of why interconnect can't use it.
While resctrl can work for managing, for example, CPU-to-memory
bandwidth for processes, this is not enough for the bigger picture when
you have a whole system-on-chip topology with distributed hardware
components talking to each other. The "Linux" CPU might not even be the
central arbiter in such topologies. Also, the interconnect code does not
interact with userspace at all.
Some reasons why the interconnect code can't use the resctrl API now:
- The distributed hardware components need to express which interconnect
path they use and how much bandwidth they need on it, in order to tune
the whole system-on-chip into its most optimal state. The interconnect
code does this with a consumer-provider API (a short consumer-side
sketch follows after this list).
- Custom aggregation of the consumer requests needs to be done based on
the topology. SoCs may use different aggregation formulas, depending for
example on whether they use a simple hierarchical bus, a crossbar or a
network-on-chip. When a NoC is used, there is interleaved traffic that
needs to be aggregated, and an interconnect path can span multiple clock
domains (allowing each functional unit to have its own clock domain). A
hypothetical aggregation formula is also sketched after this list.
- Support for complex topologies is needed - multi-tiered buses with
devices having multiple paths between each other. A device may choose
which path to use depending on its needs, or even use multiple paths for
load-balancing.
- Topology changes should be supported via an API - there are FPGA
boards that can change their topology.
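To illustrate the consumer side mentioned in the first point, here is a
minimal sketch (the probe function and the bandwidth value are just
placeholders; the calls are the same ones the GPU patch above uses, from
<linux/interconnect.h>):

	#include <linux/device.h>
	#include <linux/err.h>
	#include <linux/interconnect.h>

	/* Hypothetical consumer: names and bandwidth values are placeholders */
	static int example_probe(struct device *dev)
	{
		struct icc_path *path;

		/* Look up the (default) interconnect path of this device from DT */
		path = of_icc_get(dev, NULL);
		if (IS_ERR(path))
			return PTR_ERR(path);

		/* Vote for average/peak bandwidth on the path */
		icc_set_bw(path, 0, MBps_to_icc(3072));

		/* Later, drop the vote and release the path */
		icc_set_bw(path, 0, 0);
		icc_put(path);

		return 0;
	}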
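And to give an idea of what the aggregation from the second point could
look like, here is a hypothetical formula (not taken from any specific
platform driver) that sums the average bandwidth requests on a node and
keeps the highest peak request:

	#include <linux/types.h>

	/* Hypothetical: sum the average requests, keep the maximum peak request */
	static void example_aggregate(u32 avg_bw, u32 peak_bw,
				      u32 *agg_avg, u32 *agg_peak)
	{
		*agg_avg += avg_bw;
		if (peak_bw > *agg_peak)
			*agg_peak = peak_bw;
	}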
I looked at the existing resctrl code and at some in-flight patches, but
it doesn't feel right to me to use resctrl. I don't think much of it can
be reused or extended without very significant changes that would
probably twist resctrl away from its original purpose (and maybe
conflict with it).
TL;DR: The functionality of the resctrl and interconnect code is
different - resctrl seems to be about monitoring/managing/enforcing the
usage of resources shared by one or more CPUs, while the interconnect
code is about system-on-chip systems, allowing various modules (GPU,
DSP, modem, WiFi, Bluetooth, {en,de}coders, camera, ethernet, etc.) to
express their bandwidth needs in order to improve the power efficiency
of the whole SoC.
Hope this addresses your concerns.
Thanks,
Georgi