2014-12-23 10:40:10

by Vince Hsu

[permalink] [raw]
Subject: [PATCH 0/11] Add suspend/resume support for GK20A

Hi,

This series includes some pieces of fixes to complete the GK20A power
on/off sequences and add the suspend/resume support.

The patches 1/11 - 4/11 are based on the linux-next-20141219.
The patches 5/11 - 11/11 are based on the branch "linux-3.19" of Ben Skeggs's
tree (http://cgit.freedesktop.org/~darktama/nouveau).

Thanks,
Vince

Vince Hsu (4): (linux-next-20141219)
ARM: tegra: add function to control the GPU rail clamp
memory: tegra: add mc flush support
memory: tegra: add flush operation for Tegra124 memory clients
ARM: tegra: add mc node for Tegra124 GPU

arch/arm/boot/dts/tegra124.dtsi | 1 +
drivers/memory/tegra/mc.c | 21 +++++++++++
drivers/memory/tegra/tegra124.c | 82 +++++++++++++++++++++++++++++++++++++++++
drivers/soc/tegra/pmc.c | 34 +++++++++++------
include/soc/tegra/mc.h | 23 +++++++++++-
include/soc/tegra/pmc.h | 2 +
6 files changed, 151 insertions(+), 12 deletions(-)

Vince Hsu (7): (linux-3.19 / http://cgit.freedesktop.org/~darktama/nouveau)
platform: switch to the new gpu rail clamping function
platform: complete the power up/down sequence
instmem: make nv50_instmem_priv public
instmem: add dummy support for GK20A
drm: export some variable and functions to resue the PM functions
platform: add suspend/resume support
platform: add PM runtime suspend/resume support

drm/Kbuild | 1 +
drm/core/subdev/instmem/nv50.h | 1 +
drm/nouveau_drm.c | 16 ++--
drm/nouveau_drm.h | 2 +
drm/nouveau_platform.c | 174 ++++++++++++++++++++++++++++++++++++++++-
drm/nouveau_platform.h | 3 +
nvkm/engine/device/nve0.c | 2 +-
nvkm/include/subdev/instmem.h | 1 +
nvkm/subdev/instmem/gk20a.c | 70 +++++++++++++++++
nvkm/subdev/instmem/nv50.c | 9 +--
nvkm/subdev/instmem/nv50.h | 14 ++++
11 files changed, 279 insertions(+), 14 deletions(-)
create mode 120000 drm/core/subdev/instmem/nv50.h
create mode 100644 nvkm/subdev/instmem/gk20a.c
create mode 100644 nvkm/subdev/instmem/nv50.h

--
1.9.1


2014-12-23 10:40:13

by Vince Hsu

[permalink] [raw]
Subject: [PATCH 1/11] ARM: tegra: add function to control the GPU rail clamp

The Tegra124 and later Tegra SoCs have a sepatate rail gating register
to enable/disable the clamp. The original function
tegra_powergate_remove_clamping() is not sufficient for the enable
function. So add a new function which is dedicated to the GPU rail
gating. Also don't refer to the powergate ID since the GPU ID makes no
sense here.

Signed-off-by: Vince Hsu <[email protected]>
---
drivers/soc/tegra/pmc.c | 34 +++++++++++++++++++++++-----------
include/soc/tegra/pmc.h | 2 ++
2 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/drivers/soc/tegra/pmc.c b/drivers/soc/tegra/pmc.c
index a2c0ceb95f8f..7798c530ead1 100644
--- a/drivers/soc/tegra/pmc.c
+++ b/drivers/soc/tegra/pmc.c
@@ -225,17 +225,6 @@ int tegra_powergate_remove_clamping(int id)
return -EINVAL;

/*
- * The Tegra124 GPU has a separate register (with different semantics)
- * to remove clamps.
- */
- if (tegra_get_chip_id() == TEGRA124) {
- if (id == TEGRA_POWERGATE_3D) {
- tegra_pmc_writel(0, GPU_RG_CNTRL);
- return 0;
- }
- }
-
- /*
* Tegra 2 has a bug where PCIE and VDE clamping masks are
* swapped relatively to the partition ids
*/
@@ -253,6 +242,29 @@ int tegra_powergate_remove_clamping(int id)
EXPORT_SYMBOL(tegra_powergate_remove_clamping);

/**
+ * tegra_powergate_gpu_set_clamping - control GPU-SOC clamps
+ *
+ * The post-Tegra114 chips have a separate rail gating register to configure
+ * clamps.
+ *
+ * @assert: true to assert clamp, and false to remove
+ */
+int tegra_powergate_gpu_set_clamping(bool assert)
+{
+ if (!pmc->soc)
+ return -EINVAL;
+
+ if (tegra_get_chip_id() == TEGRA124) {
+ tegra_pmc_writel(assert ? 1 : 0, GPU_RG_CNTRL);
+ tegra_pmc_readl(GPU_RG_CNTRL);
+ return 0;
+ }
+
+ return -EINVAL;
+}
+EXPORT_SYMBOL(tegra_powergate_gpu_set_clamping);
+
+/**
* tegra_powergate_sequence_power_up() - power up partition
* @id: partition ID
* @clk: clock for partition
diff --git a/include/soc/tegra/pmc.h b/include/soc/tegra/pmc.h
index 65a93273e72f..53d620525a9e 100644
--- a/include/soc/tegra/pmc.h
+++ b/include/soc/tegra/pmc.h
@@ -109,6 +109,8 @@ int tegra_powergate_is_powered(int id);
int tegra_powergate_power_on(int id);
int tegra_powergate_power_off(int id);
int tegra_powergate_remove_clamping(int id);
+/* Only for Tegra124 and later */
+int tegra_powergate_gpu_set_clamping(bool assert);

/* Must be called with clk disabled, and returns with clk enabled */
int tegra_powergate_sequence_power_up(int id, struct clk *clk,
--
1.9.1

2014-12-23 10:40:24

by Vince Hsu

[permalink] [raw]
Subject: [PATCH 4/11] ARM: tegra: add mc node for Tegra124 GPU

The Tegra124 GPU needs the memory controller for the memory flush operatoin.
So add the node reference of memory controller in device tree.

Signed-off-by: Vince Hsu <[email protected]>
---
arch/arm/boot/dts/tegra124.dtsi | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
index 4be06c6ea0c8..4109c4548b55 100644
--- a/arch/arm/boot/dts/tegra124.dtsi
+++ b/arch/arm/boot/dts/tegra124.dtsi
@@ -187,6 +187,7 @@
clock-names = "gpu", "pwr";
resets = <&tegra_car 184>;
reset-names = "gpu";
+ mc = <&mc TEGRA_SWGROUP_GPU>;
status = "disabled";
};

--
1.9.1

2014-12-23 10:40:29

by Vince Hsu

[permalink] [raw]
Subject: [PATCH nouveau 07/11] instmem: make nv50_instmem_priv public

The GK20A needs to create a dummy instemem subdev to avoid suspend/resume
problem. So make the nv50_instmem_priv non-static for now.

Signed-off-by: Vince Hsu <[email protected]>
---
drm/core/subdev/instmem/nv50.h | 1 +
nvkm/subdev/instmem/nv50.c | 9 ++-------
nvkm/subdev/instmem/nv50.h | 14 ++++++++++++++
3 files changed, 17 insertions(+), 7 deletions(-)
create mode 120000 drm/core/subdev/instmem/nv50.h
create mode 100644 nvkm/subdev/instmem/nv50.h

diff --git a/drm/core/subdev/instmem/nv50.h b/drm/core/subdev/instmem/nv50.h
new file mode 120000
index 000000000000..043e22aee880
--- /dev/null
+++ b/drm/core/subdev/instmem/nv50.h
@@ -0,0 +1 @@
+../../../../nvkm/subdev/instmem/nv50.h
\ No newline at end of file
diff --git a/nvkm/subdev/instmem/nv50.c b/nvkm/subdev/instmem/nv50.c
index 7cb3b098a08d..66428b1c2394 100644
--- a/nvkm/subdev/instmem/nv50.c
+++ b/nvkm/subdev/instmem/nv50.c
@@ -25,14 +25,9 @@
#include <subdev/fb.h>
#include <core/mm.h>

+#include "nv50.h"
#include "priv.h"

-struct nv50_instmem_priv {
- struct nouveau_instmem base;
- spinlock_t lock;
- u64 addr;
-};
-
struct nv50_instobj_priv {
struct nouveau_instobj base;
struct nouveau_mem *mem;
@@ -117,7 +112,7 @@ nv50_instobj_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
return 0;
}

-static struct nouveau_instobj_impl
+struct nouveau_instobj_impl
nv50_instobj_oclass = {
.base.ofuncs = &(struct nouveau_ofuncs) {
.ctor = nv50_instobj_ctor,
diff --git a/nvkm/subdev/instmem/nv50.h b/nvkm/subdev/instmem/nv50.h
new file mode 100644
index 000000000000..bff03e3807ea
--- /dev/null
+++ b/nvkm/subdev/instmem/nv50.h
@@ -0,0 +1,14 @@
+#ifndef __NVKM_INSTMEM_NV50_H__
+#define __NVKM_INSTMEM_NV50_H__
+
+#include "priv.h"
+
+struct nv50_instmem_priv {
+ struct nouveau_instmem base;
+ spinlock_t lock;
+ u64 addr;
+};
+
+extern struct nouveau_instobj_impl nv50_instobj_oclass;
+
+#endif
--
1.9.1

2014-12-23 10:40:36

by Vince Hsu

[permalink] [raw]
Subject: [PATCH nouveau 10/11] platform: add suspend/resume support

We reuse most of the suspend/resume functions of the dGPU for nouveau
platform device. Only the power up/down sequences need to be handled
separately.

Signed-off-by: Vince Hsu <[email protected]>
---
drm/nouveau_platform.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 60 insertions(+)

diff --git a/drm/nouveau_platform.c b/drm/nouveau_platform.c
index 527fe2358fc9..51bfce59c498 100644
--- a/drm/nouveau_platform.c
+++ b/drm/nouveau_platform.c
@@ -211,6 +211,65 @@ static const struct of_device_id nouveau_platform_match[] = {
MODULE_DEVICE_TABLE(of, nouveau_platform_match);
#endif

+static int
+nouveau_platform_pmops_suspend(struct device *dev)
+{
+ struct platform_device *pdev = to_platform_device(dev);
+ struct drm_device *drm_dev = platform_get_drvdata(pdev);
+ struct nouveau_drm *drm = nouveau_drm(drm_dev);
+ struct nouveau_device *device = nvkm_device(&drm->device);
+ struct nouveau_platform_gpu *gpu = nv_device_to_platform(device)->gpu;
+ int ret;
+
+ if (drm_dev->switch_power_state == DRM_SWITCH_POWER_OFF ||
+ drm_dev->switch_power_state == DRM_SWITCH_POWER_DYNAMIC_OFF)
+ return 0;
+
+ ret = nouveau_do_suspend(drm_dev, false);
+ if (ret)
+ return ret;
+
+ ret = nouveau_platform_power_down(gpu);
+ if (ret) {
+ dev_err(dev, "failed to power down gpu (err:%d)\n", ret);
+ return ret;
+ }
+
+ return 0;
+}
+
+static int
+nouveau_platform_pmops_resume(struct device *dev)
+{
+ struct platform_device *pdev = to_platform_device(dev);
+ struct drm_device *drm_dev = platform_get_drvdata(pdev);
+ struct nouveau_drm *drm = nouveau_drm(drm_dev);
+ struct nouveau_device *device = nvkm_device(&drm->device);
+ struct nouveau_platform_gpu *gpu = nv_device_to_platform(device)->gpu;
+ int ret;
+
+ if (drm_dev->switch_power_state == DRM_SWITCH_POWER_OFF ||
+ drm_dev->switch_power_state == DRM_SWITCH_POWER_DYNAMIC_OFF)
+ return 0;
+
+ ret = nouveau_platform_power_up(gpu);
+ if (ret) {
+ dev_err(dev, "failed to power up gpu\n");
+ return ret;
+ }
+
+ ret = nouveau_do_resume(drm_dev, false);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static const struct dev_pm_ops nouveau_platform_pm_ops = {
+ .suspend = nouveau_platform_pmops_suspend,
+ .resume = nouveau_platform_pmops_resume,
+};
+
struct platform_driver nouveau_platform_driver = {
.driver = {
.name = "nouveau",
@@ -218,6 +277,7 @@ struct platform_driver nouveau_platform_driver = {
},
.probe = nouveau_platform_probe,
.remove = nouveau_platform_remove,
+ .driver.pm = &nouveau_platform_pm_ops,
};

module_platform_driver(nouveau_platform_driver);
--
1.9.1

2014-12-23 10:40:40

by Vince Hsu

[permalink] [raw]
Subject: [PATCH nouveau 11/11] platform: add PM runtime suspend/resume support

The runtime suspend/resume flow is almost the same as the suspend/resume
except the the arguments passed to nouveau_do_{suspend,resume}.

Signed-off-by: Vince Hsu <[email protected]>
---
drm/nouveau_platform.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 70 insertions(+)

diff --git a/drm/nouveau_platform.c b/drm/nouveau_platform.c
index 51bfce59c498..c71c62081b9c 100644
--- a/drm/nouveau_platform.c
+++ b/drm/nouveau_platform.c
@@ -177,6 +177,9 @@ static int nouveau_platform_probe(struct platform_device *pdev)
if (err < 0)
goto err_unref;

+ if (nouveau_runtime_pm != 0)
+ pm_runtime_enable(&pdev->dev);
+
return 0;

err_unref:
@@ -197,6 +200,9 @@ static int nouveau_platform_remove(struct platform_device *pdev)
struct nouveau_device *device = nvkm_device(&drm->device);
struct nouveau_platform_gpu *gpu = nv_device_to_platform(device)->gpu;

+ if (nouveau_runtime_pm != 0)
+ pm_runtime_disable(&pdev->dev);
+
nouveau_drm_device_remove(drm_dev);

return nouveau_platform_power_down(gpu);
@@ -265,9 +271,73 @@ nouveau_platform_pmops_resume(struct device *dev)
return 0;
}

+static int
+nouveau_platform_pmops_runtime_suspend(struct device *dev)
+{
+ struct platform_device *pdev = to_platform_device(dev);
+ struct drm_device *drm_dev = platform_get_drvdata(pdev);
+ struct nouveau_drm *drm = nouveau_drm(drm_dev);
+ struct nouveau_device *device = nvkm_device(&drm->device);
+ struct nouveau_platform_gpu *gpu = nv_device_to_platform(device)->gpu;
+ int ret;
+
+ ret = nouveau_do_suspend(drm_dev, true);
+ if (ret) {
+ dev_err(dev, "failed to suspend drm device (err:%d)\n", ret);
+ return ret;
+ }
+
+ ret = nouveau_platform_power_down(gpu);
+ if (ret) {
+ dev_err(dev, "failed to power down gpu (err:%d)\n", ret);
+ return ret;
+ }
+
+ drm_dev->switch_power_state = DRM_SWITCH_POWER_DYNAMIC_OFF;
+ return ret;
+}
+
+static int
+nouveau_platform_pmops_runtime_resume(struct device *dev)
+{
+ struct platform_device *pdev = to_platform_device(dev);
+ struct drm_device *drm_dev = platform_get_drvdata(pdev);
+ struct nouveau_drm *drm = nouveau_drm(drm_dev);
+ struct nouveau_device *device = nvkm_device(&drm->device);
+ struct nouveau_platform_gpu *gpu = nv_device_to_platform(device)->gpu;
+ int ret;
+
+ ret = nouveau_platform_power_up(gpu);
+ if (ret) {
+ dev_err(dev, "failed to power up gpu\n");
+ return ret;
+ }
+
+ ret = nouveau_do_resume(drm_dev, true);
+ if (ret) {
+ dev_err(dev, "failed to resume\n");
+ return ret;
+ }
+
+ drm_dev->switch_power_state = DRM_SWITCH_POWER_ON;
+ return ret;
+}
+
+static int
+nouveau_platform_pmops_runtime_idle(struct device *dev)
+{
+ pm_runtime_mark_last_busy(dev);
+ pm_runtime_autosuspend(dev);
+
+ return 1;
+}
+
static const struct dev_pm_ops nouveau_platform_pm_ops = {
.suspend = nouveau_platform_pmops_suspend,
.resume = nouveau_platform_pmops_resume,
+ .runtime_suspend = nouveau_platform_pmops_runtime_suspend,
+ .runtime_resume = nouveau_platform_pmops_runtime_resume,
+ .runtime_idle = nouveau_platform_pmops_runtime_idle,
};

struct platform_driver nouveau_platform_driver = {
--
1.9.1

2014-12-23 10:40:34

by Vince Hsu

[permalink] [raw]
Subject: [PATCH nouveau 08/11] instmem: add dummy support for GK20A

This is a workaround to avoid the instmem backup/restore during the suspend
and resume process in nv50 instemem driver.

Signed-off-by: Vince Hsu <[email protected]>
---
drm/Kbuild | 1 +
nvkm/engine/device/nve0.c | 2 +-
nvkm/include/subdev/instmem.h | 1 +
nvkm/subdev/instmem/gk20a.c | 70 +++++++++++++++++++++++++++++++++++++++++++
4 files changed, 73 insertions(+), 1 deletion(-)
create mode 100644 nvkm/subdev/instmem/gk20a.c

diff --git a/drm/Kbuild b/drm/Kbuild
index 6461e3565afe..ea40cd653c7c 100644
--- a/drm/Kbuild
+++ b/drm/Kbuild
@@ -176,6 +176,7 @@ nouveau-y += core/subdev/instmem/base.o
nouveau-y += core/subdev/instmem/nv04.o
nouveau-y += core/subdev/instmem/nv40.o
nouveau-y += core/subdev/instmem/nv50.o
+nouveau-y += core/subdev/instmem/gk20a.o
nouveau-y += core/subdev/ltc/base.o
nouveau-y += core/subdev/ltc/gf100.o
nouveau-y += core/subdev/ltc/gk104.o
diff --git a/nvkm/engine/device/nve0.c b/nvkm/engine/device/nve0.c
index 732922690653..fcbdc5259c7c 100644
--- a/nvkm/engine/device/nve0.c
+++ b/nvkm/engine/device/nve0.c
@@ -170,7 +170,7 @@ nve0_identify(struct nouveau_device *device)
device->oclass[NVDEV_SUBDEV_FB ] = gk20a_fb_oclass;
device->oclass[NVDEV_SUBDEV_LTC ] = gk104_ltc_oclass;
device->oclass[NVDEV_SUBDEV_IBUS ] = &gk20a_ibus_oclass;
- device->oclass[NVDEV_SUBDEV_INSTMEM] = nv50_instmem_oclass;
+ device->oclass[NVDEV_SUBDEV_INSTMEM] = gk20a_instmem_oclass;
device->oclass[NVDEV_SUBDEV_VM ] = &nvc0_vmmgr_oclass;
device->oclass[NVDEV_SUBDEV_BAR ] = &gk20a_bar_oclass;
device->oclass[NVDEV_ENGINE_DMAOBJ ] = nvd0_dmaeng_oclass;
diff --git a/nvkm/include/subdev/instmem.h b/nvkm/include/subdev/instmem.h
index c1df26f3230c..6264660bedce 100644
--- a/nvkm/include/subdev/instmem.h
+++ b/nvkm/include/subdev/instmem.h
@@ -48,5 +48,6 @@ nouveau_instmem(void *obj)
extern struct nouveau_oclass *nv04_instmem_oclass;
extern struct nouveau_oclass *nv40_instmem_oclass;
extern struct nouveau_oclass *nv50_instmem_oclass;
+extern struct nouveau_oclass *gk20a_instmem_oclass;

#endif
diff --git a/nvkm/subdev/instmem/gk20a.c b/nvkm/subdev/instmem/gk20a.c
new file mode 100644
index 000000000000..5e072d6e743f
--- /dev/null
+++ b/nvkm/subdev/instmem/gk20a.c
@@ -0,0 +1,70 @@
+/*
+ * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include "nv50.h"
+#include "priv.h"
+
+static int
+gk20a_instmem_fini(struct nouveau_object *object, bool suspend)
+{
+ struct nouveau_instmem *imem = (void *)object;
+
+ return nouveau_subdev_fini(&imem->base, suspend);
+}
+
+static int
+gk20a_instmem_init(struct nouveau_object *object)
+{
+ struct nouveau_instmem *imem = (void *)object;
+
+ return nouveau_subdev_init(&imem->base);
+}
+
+static int
+gk20a_instmem_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
+ struct nouveau_oclass *oclass, void *data, u32 size,
+ struct nouveau_object **pobject)
+{
+ struct nv50_instmem_priv *priv;
+ int ret;
+
+ ret = nouveau_instmem_create(parent, engine, oclass, &priv);
+ *pobject = nv_object(priv);
+ if (ret)
+ return ret;
+
+ spin_lock_init(&priv->lock);
+ return 0;
+}
+
+struct nouveau_oclass *
+gk20a_instmem_oclass = &(struct nouveau_instmem_impl) {
+ .base.handle = NV_SUBDEV(INSTMEM, 0x50),
+ .base.ofuncs = &(struct nouveau_ofuncs) {
+ .ctor = gk20a_instmem_ctor,
+ .dtor = _nouveau_instmem_dtor,
+ .init = gk20a_instmem_init,
+ .fini = gk20a_instmem_fini,
+ },
+ .instobj = &nv50_instobj_oclass.base,
+}.base;
--
1.9.1

2014-12-23 10:41:24

by Vince Hsu

[permalink] [raw]
Subject: [PATCH nouveau 09/11] drm: export some variable and functions to resue the PM functions

This patch adds some checks in the suspend/resume functions to distinguish
the dGPU and mobile GPU and exports some variables/functions so that the
nouveau platform device can reuse them.

Signed-off-by: Vince Hsu <[email protected]>
---
drm/nouveau_drm.c | 16 +++++++++++-----
drm/nouveau_drm.h | 2 ++
2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drm/nouveau_drm.c b/drm/nouveau_drm.c
index afb93bb72f97..0ed99ef80211 100644
--- a/drm/nouveau_drm.c
+++ b/drm/nouveau_drm.c
@@ -72,6 +72,7 @@ module_param_named(modeset, nouveau_modeset, int, 0400);

MODULE_PARM_DESC(runpm, "disable (0), force enable (1), optimus only default (-1)");
int nouveau_runtime_pm = -1;
+EXPORT_SYMBOL(nouveau_runtime_pm);
module_param_named(runpm, nouveau_runtime_pm, int, 0400);

static struct drm_driver driver_stub;
@@ -543,7 +544,7 @@ nouveau_drm_remove(struct pci_dev *pdev)
nouveau_drm_device_remove(dev);
}

-static int
+int
nouveau_do_suspend(struct drm_device *dev, bool runtime)
{
struct nouveau_drm *drm = nouveau_drm(dev);
@@ -559,8 +560,10 @@ nouveau_do_suspend(struct drm_device *dev, bool runtime)
return ret;
}

- NV_INFO(drm, "evicting buffers...\n");
- ttm_bo_evict_mm(&drm->ttm.bdev, TTM_PL_VRAM);
+ if (dev->pdev) {
+ NV_INFO(drm, "evicting buffers...\n");
+ ttm_bo_evict_mm(&drm->ttm.bdev, TTM_PL_VRAM);
+ }

NV_INFO(drm, "waiting for kernel channels to go idle...\n");
if (drm->cechan) {
@@ -612,8 +615,9 @@ fail_display:
}
return ret;
}
+EXPORT_SYMBOL(nouveau_do_suspend);

-static int
+int
nouveau_do_resume(struct drm_device *dev, bool runtime)
{
struct nouveau_drm *drm = nouveau_drm(dev);
@@ -635,7 +639,8 @@ nouveau_do_resume(struct drm_device *dev, bool runtime)
nvif_client_resume(&cli->base);
}

- nouveau_run_vbios_init(dev);
+ if (dev->pdev)
+ nouveau_run_vbios_init(dev);

if (dev->mode_config.num_crtc) {
NV_INFO(drm, "resuming display...\n");
@@ -646,6 +651,7 @@ nouveau_do_resume(struct drm_device *dev, bool runtime)

return 0;
}
+EXPORT_SYMBOL(nouveau_do_resume);

int
nouveau_pmops_suspend(struct device *dev)
diff --git a/drm/nouveau_drm.h b/drm/nouveau_drm.h
index 8ae36f265fb8..897d295dd1e3 100644
--- a/drm/nouveau_drm.h
+++ b/drm/nouveau_drm.h
@@ -177,6 +177,8 @@ nouveau_drm(struct drm_device *dev)

int nouveau_pmops_suspend(struct device *);
int nouveau_pmops_resume(struct device *);
+int nouveau_do_suspend(struct drm_device *dev, bool runtime);
+int nouveau_do_resume(struct drm_device *dev, bool runtime);

#define nouveau_platform_device_create(p, u) \
nouveau_platform_device_create_(p, sizeof(**u), (void **)u)
--
1.9.1

2014-12-23 10:40:23

by Vince Hsu

[permalink] [raw]
Subject: [PATCH nouveau 05/11] platform: switch to the new gpu rail clamping function

Signed-off-by: Vince Hsu <[email protected]>
---
drm/nouveau_platform.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drm/nouveau_platform.c b/drm/nouveau_platform.c
index b307bbedd4c4..68788b17a45c 100644
--- a/drm/nouveau_platform.c
+++ b/drm/nouveau_platform.c
@@ -53,7 +53,7 @@ static int nouveau_platform_power_up(struct nouveau_platform_gpu *gpu)
reset_control_assert(gpu->rst);
udelay(10);

- err = tegra_powergate_remove_clamping(TEGRA_POWERGATE_3D);
+ err = tegra_powergate_gpu_set_clamping(false);
if (err)
goto err_clamp;
udelay(10);
--
1.9.1

2014-12-23 10:41:56

by Vince Hsu

[permalink] [raw]
Subject: [PATCH nouveau 06/11] platform: complete the power up/down sequence

This patch adds some missing pieces of the rail gaing/ungating sequence that
can improve the stability in theory.

Signed-off-by: Vince Hsu <[email protected]>
---
drm/nouveau_platform.c | 42 ++++++++++++++++++++++++++++++++++++++++++
drm/nouveau_platform.h | 3 +++
2 files changed, 45 insertions(+)

diff --git a/drm/nouveau_platform.c b/drm/nouveau_platform.c
index 68788b17a45c..527fe2358fc9 100644
--- a/drm/nouveau_platform.c
+++ b/drm/nouveau_platform.c
@@ -25,9 +25,11 @@
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/of.h>
+#include <linux/of_platform.h>
#include <linux/reset.h>
#include <linux/regulator/consumer.h>
#include <soc/tegra/fuse.h>
+#include <soc/tegra/mc.h>
#include <soc/tegra/pmc.h>

#include "nouveau_drm.h"
@@ -61,6 +63,9 @@ static int nouveau_platform_power_up(struct nouveau_platform_gpu *gpu)
reset_control_deassert(gpu->rst);
udelay(10);

+ tegra_mc_flush(gpu->mc, gpu->swgroup, false);
+ udelay(10);
+
return 0;

err_clamp:
@@ -77,6 +82,14 @@ static int nouveau_platform_power_down(struct nouveau_platform_gpu *gpu)
{
int err;

+ tegra_mc_flush(gpu->mc, gpu->swgroup, true);
+ udelay(10);
+
+ err = tegra_powergate_gpu_set_clamping(true);
+ if (err)
+ return err;
+ udelay(10);
+
reset_control_assert(gpu->rst);
udelay(10);

@@ -91,6 +104,31 @@ static int nouveau_platform_power_down(struct nouveau_platform_gpu *gpu)
return 0;
}

+static int nouveau_platform_get_mc(struct device *dev,
+ struct tegra_mc **mc, unsigned int *swgroup)
+{
+ struct of_phandle_args args;
+ struct platform_device *pdev;
+ int ret;
+
+ ret = of_parse_phandle_with_fixed_args(dev->of_node, "mc",
+ 1, 0, &args);
+ if (ret)
+ return ret;
+
+ pdev = of_find_device_by_node(args.np);
+ if (!pdev)
+ return -EINVAL;
+
+ *mc = platform_get_drvdata(pdev);
+ if (!*mc)
+ return -EINVAL;
+
+ *swgroup = args.args[0];
+
+ return 0;
+}
+
static int nouveau_platform_probe(struct platform_device *pdev)
{
struct nouveau_platform_gpu *gpu;
@@ -118,6 +156,10 @@ static int nouveau_platform_probe(struct platform_device *pdev)
if (IS_ERR(gpu->clk_pwr))
return PTR_ERR(gpu->clk_pwr);

+ err = nouveau_platform_get_mc(&pdev->dev, &gpu->mc, &gpu->swgroup);
+ if (err)
+ return err;
+
err = nouveau_platform_power_up(gpu);
if (err)
return err;
diff --git a/drm/nouveau_platform.h b/drm/nouveau_platform.h
index 58c28b5653d5..969dbddd786d 100644
--- a/drm/nouveau_platform.h
+++ b/drm/nouveau_platform.h
@@ -35,6 +35,9 @@ struct nouveau_platform_gpu {
struct clk *clk_pwr;

struct regulator *vdd;
+
+ struct tegra_mc *mc;
+ unsigned int swgroup;
};

struct nouveau_platform_device {
--
1.9.1

2014-12-23 10:40:18

by Vince Hsu

[permalink] [raw]
Subject: [PATCH 3/11] memory: tegra: add flush operation for Tegra124 memory clients

Signed-off-by: Vince Hsu <[email protected]>
---
drivers/memory/tegra/tegra124.c | 82 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 82 insertions(+)

diff --git a/drivers/memory/tegra/tegra124.c b/drivers/memory/tegra/tegra124.c
index 278d40b854c1..036935743a0a 100644
--- a/drivers/memory/tegra/tegra124.c
+++ b/drivers/memory/tegra/tegra124.c
@@ -6,6 +6,7 @@
* published by the Free Software Foundation.
*/

+#include <linux/delay.h>
#include <linux/of.h>
#include <linux/mm.h>

@@ -959,7 +960,85 @@ static const struct tegra_smmu_swgroup tegra124_swgroups[] = {
{ .swgroup = TEGRA_SWGROUP_VI, .reg = 0x280 },
};

+static const struct tegra_mc_hr tegra124_mc_hr[] = {
+ {TEGRA_SWGROUP_AFI, 0x200, 0x200, 0},
+ {TEGRA_SWGROUP_AVPC, 0x200, 0x200, 1},
+ {TEGRA_SWGROUP_DC, 0x200, 0x200, 2},
+ {TEGRA_SWGROUP_DCB, 0x200, 0x200, 3},
+ {TEGRA_SWGROUP_HC, 0x200, 0x200, 6},
+ {TEGRA_SWGROUP_HDA, 0x200, 0x200, 7},
+ {TEGRA_SWGROUP_ISP2, 0x200, 0x200, 8},
+ {TEGRA_SWGROUP_MPCORE, 0x200, 0x200, 9},
+ {TEGRA_SWGROUP_MPCORELP, 0x200, 0x200, 10},
+ {TEGRA_SWGROUP_MSENC, 0x200, 0x200, 11},
+ {TEGRA_SWGROUP_PPCS, 0x200, 0x200, 14},
+ {TEGRA_SWGROUP_SATA, 0x200, 0x200, 15},
+ {TEGRA_SWGROUP_VDE, 0x200, 0x200, 16},
+ {TEGRA_SWGROUP_VI, 0x200, 0x200, 17},
+ {TEGRA_SWGROUP_VIC, 0x200, 0x200, 18},
+ {TEGRA_SWGROUP_XUSB_HOST, 0x200, 0x200, 19},
+ {TEGRA_SWGROUP_XUSB_DEV, 0x200, 0x200, 20},
+ {TEGRA_SWGROUP_TSEC, 0x200, 0x200, 22},
+ {TEGRA_SWGROUP_SDMMC1A, 0x200, 0x200, 29},
+ {TEGRA_SWGROUP_SDMMC2A, 0x200, 0x200, 30},
+ {TEGRA_SWGROUP_SDMMC3A, 0x200, 0x200, 31},
+ {TEGRA_SWGROUP_SDMMC4A, 0x970, 0x974, 0},
+ {TEGRA_SWGROUP_ISP2B, 0x970, 0x974, 1},
+ {TEGRA_SWGROUP_GPU, 0x970, 0x974, 2},
+};
+
#ifdef CONFIG_ARCH_TEGRA_124_SOC
+
+static bool tegra124_stable_hotreset_check(struct tegra_mc *mc,
+ u32 reg, u32 *stat)
+{
+ int i;
+ u32 cur_stat;
+ u32 prv_stat;
+
+ prv_stat = mc_readl(mc, reg);
+ for (i = 0; i < 5; i++) {
+ cur_stat = mc_readl(mc, reg);
+ if (cur_stat != prv_stat)
+ return false;
+ }
+ *stat = cur_stat;
+ return true;
+}
+
+static int tegra124_mc_flush(struct tegra_mc *mc,
+ const struct tegra_mc_hr *hr_client, bool enable)
+{
+ u32 val;
+
+ if (!mc || !hr_client)
+ return -EINVAL;
+
+ val = mc_readl(mc, hr_client->ctrl);
+ if (enable)
+ val |= BIT(hr_client->bit);
+ else
+ val &= ~BIT(hr_client->bit);
+ mc_writel(mc, val, hr_client->ctrl);
+ mc_readl(mc, hr_client->ctrl);
+
+ /* poll till the flush is done */
+ if (enable) {
+ do {
+ udelay(10);
+ val = 0;
+ if (!tegra124_stable_hotreset_check(mc, hr_client->status, &val))
+ continue;
+ } while (!(val & BIT(hr_client->bit)));
+ }
+
+ return 0;
+}
+
+static const struct tegra_mc_ops tegra124_mc_ops = {
+ .flush = tegra124_mc_flush,
+};
+
static void tegra124_flush_dcache(struct page *page, unsigned long offset,
size_t size)
{
@@ -991,5 +1070,8 @@ const struct tegra_mc_soc tegra124_mc_soc = {
.num_address_bits = 34,
.atom_size = 32,
.smmu = &tegra124_smmu_soc,
+ .hr_clients = tegra124_mc_hr,
+ .num_hr_clients = ARRAY_SIZE(tegra124_mc_hr),
+ .ops = &tegra124_mc_ops,
};
#endif /* CONFIG_ARCH_TEGRA_124_SOC */
--
1.9.1

2014-12-23 10:42:49

by Vince Hsu

[permalink] [raw]
Subject: [PATCH 2/11] memory: tegra: add mc flush support

The flush operation of memory clients is needed for various IP blocks in
the Tegra SoCs to perform a clean reset.

Signed-off-by: Vince Hsu <[email protected]>
---
drivers/memory/tegra/mc.c | 21 +++++++++++++++++++++
include/soc/tegra/mc.h | 23 ++++++++++++++++++++++-
2 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index fe3c44e7e1d1..a2928b4b26fe 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -62,6 +62,27 @@ static const struct of_device_id tegra_mc_of_match[] = {
};
MODULE_DEVICE_TABLE(of, tegra_mc_of_match);

+int tegra_mc_flush(struct tegra_mc *mc, unsigned int swgroup, bool enable)
+{
+ int i;
+ const struct tegra_mc_hr *client;
+
+ if (!mc || !mc->soc->hr_clients ||
+ !mc->soc->ops || !mc->soc->ops->flush)
+ return -EINVAL;;
+
+ client = mc->soc->hr_clients;
+
+ for (i = 0; i < mc->soc->num_hr_clients; i++, client++) {
+ if (swgroup == client->swgroup) {
+ return mc->soc->ops->flush(mc, client, enable);
+ }
+ }
+
+ return -EINVAL;
+}
+EXPORT_SYMBOL(tegra_mc_flush);
+
static int tegra_mc_setup_latency_allowance(struct tegra_mc *mc)
{
unsigned long long tick;
diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
index 63deb8d9f82a..4894aec7d2a0 100644
--- a/include/soc/tegra/mc.h
+++ b/include/soc/tegra/mc.h
@@ -39,6 +39,21 @@ struct tegra_mc_client {
struct tegra_mc_la la;
};

+/* hot reset */
+struct tegra_mc_hr {
+ unsigned int swgroup;
+ unsigned int ctrl;
+ unsigned int status;
+ unsigned int bit;
+};
+
+struct tegra_mc;
+
+struct tegra_mc_ops {
+ int (*flush)(struct tegra_mc *mc, const struct tegra_mc_hr *hr_client,
+ bool enable);
+};
+
struct tegra_smmu_swgroup {
unsigned int swgroup;
unsigned int reg;
@@ -64,7 +79,6 @@ struct tegra_smmu_soc {
const struct tegra_smmu_ops *ops;
};

-struct tegra_mc;
struct tegra_smmu;

#ifdef CONFIG_TEGRA_IOMMU_SMMU
@@ -84,6 +98,11 @@ struct tegra_mc_soc {
const struct tegra_mc_client *clients;
unsigned int num_clients;

+ const struct tegra_mc_hr *hr_clients;
+ unsigned int num_hr_clients;
+
+ const struct tegra_mc_ops *ops;
+
const unsigned int *emem_regs;
unsigned int num_emem_regs;

@@ -104,4 +123,6 @@ struct tegra_mc {
unsigned long tick;
};

+int tegra_mc_flush(struct tegra_mc *mc, unsigned int swgroup, bool enable);
+
#endif /* __SOC_TEGRA_MC_H__ */
--
1.9.1

2014-12-23 16:39:33

by Ilia Mirkin

[permalink] [raw]
Subject: Re: [Nouveau] [PATCH nouveau 08/11] instmem: add dummy support for GK20A

On Tue, Dec 23, 2014 at 5:40 AM, Vince Hsu <[email protected]> wrote:
> This is a workaround to avoid the instmem backup/restore during the suspend
> and resume process in nv50 instemem driver.
>
> Signed-off-by: Vince Hsu <[email protected]>
> ---
> drm/Kbuild | 1 +
> nvkm/engine/device/nve0.c | 2 +-
> nvkm/include/subdev/instmem.h | 1 +
> nvkm/subdev/instmem/gk20a.c | 70 +++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 73 insertions(+), 1 deletion(-)
> create mode 100644 nvkm/subdev/instmem/gk20a.c
>
> diff --git a/drm/Kbuild b/drm/Kbuild
> index 6461e3565afe..ea40cd653c7c 100644
> --- a/drm/Kbuild
> +++ b/drm/Kbuild
> @@ -176,6 +176,7 @@ nouveau-y += core/subdev/instmem/base.o
> nouveau-y += core/subdev/instmem/nv04.o
> nouveau-y += core/subdev/instmem/nv40.o
> nouveau-y += core/subdev/instmem/nv50.o
> +nouveau-y += core/subdev/instmem/gk20a.o
> nouveau-y += core/subdev/ltc/base.o
> nouveau-y += core/subdev/ltc/gf100.o
> nouveau-y += core/subdev/ltc/gk104.o
> diff --git a/nvkm/engine/device/nve0.c b/nvkm/engine/device/nve0.c
> index 732922690653..fcbdc5259c7c 100644
> --- a/nvkm/engine/device/nve0.c
> +++ b/nvkm/engine/device/nve0.c
> @@ -170,7 +170,7 @@ nve0_identify(struct nouveau_device *device)
> device->oclass[NVDEV_SUBDEV_FB ] = gk20a_fb_oclass;
> device->oclass[NVDEV_SUBDEV_LTC ] = gk104_ltc_oclass;
> device->oclass[NVDEV_SUBDEV_IBUS ] = &gk20a_ibus_oclass;
> - device->oclass[NVDEV_SUBDEV_INSTMEM] = nv50_instmem_oclass;
> + device->oclass[NVDEV_SUBDEV_INSTMEM] = gk20a_instmem_oclass;
> device->oclass[NVDEV_SUBDEV_VM ] = &nvc0_vmmgr_oclass;
> device->oclass[NVDEV_SUBDEV_BAR ] = &gk20a_bar_oclass;
> device->oclass[NVDEV_ENGINE_DMAOBJ ] = nvd0_dmaeng_oclass;
> diff --git a/nvkm/include/subdev/instmem.h b/nvkm/include/subdev/instmem.h
> index c1df26f3230c..6264660bedce 100644
> --- a/nvkm/include/subdev/instmem.h
> +++ b/nvkm/include/subdev/instmem.h
> @@ -48,5 +48,6 @@ nouveau_instmem(void *obj)
> extern struct nouveau_oclass *nv04_instmem_oclass;
> extern struct nouveau_oclass *nv40_instmem_oclass;
> extern struct nouveau_oclass *nv50_instmem_oclass;
> +extern struct nouveau_oclass *gk20a_instmem_oclass;
>
> #endif
> diff --git a/nvkm/subdev/instmem/gk20a.c b/nvkm/subdev/instmem/gk20a.c
> new file mode 100644
> index 000000000000..5e072d6e743f
> --- /dev/null
> +++ b/nvkm/subdev/instmem/gk20a.c
> @@ -0,0 +1,70 @@
> +/*
> + * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + */
> +
> +#include "nv50.h"

I'm confused... what exactly depends on nv50_instmem_priv here? Why
not just create a gk20a_instmem_priv and leave the nv50 one alone?

> +#include "priv.h"
> +
> +static int
> +gk20a_instmem_fini(struct nouveau_object *object, bool suspend)
> +{
> + struct nouveau_instmem *imem = (void *)object;
> +
> + return nouveau_subdev_fini(&imem->base, suspend);
> +}
> +
> +static int
> +gk20a_instmem_init(struct nouveau_object *object)
> +{
> + struct nouveau_instmem *imem = (void *)object;
> +
> + return nouveau_subdev_init(&imem->base);
> +}

I think the style is to just link those up directly in the class
definition when they're trivial like that, i.e. point them at
_nouveau_subdev_init and such.

> +
> +static int
> +gk20a_instmem_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
> + struct nouveau_oclass *oclass, void *data, u32 size,
> + struct nouveau_object **pobject)
> +{
> + struct nv50_instmem_priv *priv;
> + int ret;
> +
> + ret = nouveau_instmem_create(parent, engine, oclass, &priv);
> + *pobject = nv_object(priv);
> + if (ret)
> + return ret;
> +
> + spin_lock_init(&priv->lock);
> + return 0;
> +}
> +
> +struct nouveau_oclass *
> +gk20a_instmem_oclass = &(struct nouveau_instmem_impl) {
> + .base.handle = NV_SUBDEV(INSTMEM, 0x50),
> + .base.ofuncs = &(struct nouveau_ofuncs) {
> + .ctor = gk20a_instmem_ctor,
> + .dtor = _nouveau_instmem_dtor,
> + .init = gk20a_instmem_init,
> + .fini = gk20a_instmem_fini,
> + },
> + .instobj = &nv50_instobj_oclass.base,
> +}.base;
> --
> 1.9.1
>
> _______________________________________________
> Nouveau mailing list
> [email protected]
> http://lists.freedesktop.org/mailman/listinfo/nouveau

2014-12-24 02:45:21

by Vince Hsu

[permalink] [raw]
Subject: Re: [Nouveau] [PATCH nouveau 08/11] instmem: add dummy support for GK20A

Hi,

On 12/24/2014 12:39 AM, Ilia Mirkin wrote:
> On Tue, Dec 23, 2014 at 5:40 AM, Vince Hsu <[email protected]> wrote:
>> This is a workaround to avoid the instmem backup/restore during the suspend
>> and resume process in nv50 instemem driver.
>>
>> Signed-off-by: Vince Hsu <[email protected]>
>> ---
>> drm/Kbuild | 1 +
>> nvkm/engine/device/nve0.c | 2 +-
>> nvkm/include/subdev/instmem.h | 1 +
>> nvkm/subdev/instmem/gk20a.c | 70 +++++++++++++++++++++++++++++++++++++++++++
>> 4 files changed, 73 insertions(+), 1 deletion(-)
>> create mode 100644 nvkm/subdev/instmem/gk20a.c
>>
>> diff --git a/drm/Kbuild b/drm/Kbuild
>> index 6461e3565afe..ea40cd653c7c 100644
>> --- a/drm/Kbuild
>> +++ b/drm/Kbuild
>> @@ -176,6 +176,7 @@ nouveau-y += core/subdev/instmem/base.o
>> nouveau-y += core/subdev/instmem/nv04.o
>> nouveau-y += core/subdev/instmem/nv40.o
>> nouveau-y += core/subdev/instmem/nv50.o
>> +nouveau-y += core/subdev/instmem/gk20a.o
>> nouveau-y += core/subdev/ltc/base.o
>> nouveau-y += core/subdev/ltc/gf100.o
>> nouveau-y += core/subdev/ltc/gk104.o
>> diff --git a/nvkm/engine/device/nve0.c b/nvkm/engine/device/nve0.c
>> index 732922690653..fcbdc5259c7c 100644
>> --- a/nvkm/engine/device/nve0.c
>> +++ b/nvkm/engine/device/nve0.c
>> @@ -170,7 +170,7 @@ nve0_identify(struct nouveau_device *device)
>> device->oclass[NVDEV_SUBDEV_FB ] = gk20a_fb_oclass;
>> device->oclass[NVDEV_SUBDEV_LTC ] = gk104_ltc_oclass;
>> device->oclass[NVDEV_SUBDEV_IBUS ] = &gk20a_ibus_oclass;
>> - device->oclass[NVDEV_SUBDEV_INSTMEM] = nv50_instmem_oclass;
>> + device->oclass[NVDEV_SUBDEV_INSTMEM] = gk20a_instmem_oclass;
>> device->oclass[NVDEV_SUBDEV_VM ] = &nvc0_vmmgr_oclass;
>> device->oclass[NVDEV_SUBDEV_BAR ] = &gk20a_bar_oclass;
>> device->oclass[NVDEV_ENGINE_DMAOBJ ] = nvd0_dmaeng_oclass;
>> diff --git a/nvkm/include/subdev/instmem.h b/nvkm/include/subdev/instmem.h
>> index c1df26f3230c..6264660bedce 100644
>> --- a/nvkm/include/subdev/instmem.h
>> +++ b/nvkm/include/subdev/instmem.h
>> @@ -48,5 +48,6 @@ nouveau_instmem(void *obj)
>> extern struct nouveau_oclass *nv04_instmem_oclass;
>> extern struct nouveau_oclass *nv40_instmem_oclass;
>> extern struct nouveau_oclass *nv50_instmem_oclass;
>> +extern struct nouveau_oclass *gk20a_instmem_oclass;
>>
>> #endif
>> diff --git a/nvkm/subdev/instmem/gk20a.c b/nvkm/subdev/instmem/gk20a.c
>> new file mode 100644
>> index 000000000000..5e072d6e743f
>> --- /dev/null
>> +++ b/nvkm/subdev/instmem/gk20a.c
>> @@ -0,0 +1,70 @@
>> +/*
>> + * Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>> + * DEALINGS IN THE SOFTWARE.
>> + *
>> + */
>> +
>> +#include "nv50.h"
> I'm confused... what exactly depends on nv50_instmem_priv here? Why
> not just create a gk20a_instmem_priv and leave the nv50 one alone?
You're right. Will fix in the next version.
>
>> +#include "priv.h"
>> +
>> +static int
>> +gk20a_instmem_fini(struct nouveau_object *object, bool suspend)
>> +{
>> + struct nouveau_instmem *imem = (void *)object;
>> +
>> + return nouveau_subdev_fini(&imem->base, suspend);
>> +}
>> +
>> +static int
>> +gk20a_instmem_init(struct nouveau_object *object)
>> +{
>> + struct nouveau_instmem *imem = (void *)object;
>> +
>> + return nouveau_subdev_init(&imem->base);
>> +}
> I think the style is to just link those up directly in the class
> definition when they're trivial like that, i.e. point them at
> _nouveau_subdev_init and such.
Thanks for the hint. Will fix. :)

>
>> +
>> +static int
>> +gk20a_instmem_ctor(struct nouveau_object *parent, struct nouveau_object *engine,
>> + struct nouveau_oclass *oclass, void *data, u32 size,
>> + struct nouveau_object **pobject)
>> +{
>> + struct nv50_instmem_priv *priv;
>> + int ret;
>> +
>> + ret = nouveau_instmem_create(parent, engine, oclass, &priv);
>> + *pobject = nv_object(priv);
>> + if (ret)
>> + return ret;
>> +
>> + spin_lock_init(&priv->lock);
>> + return 0;
>> +}
>> +
>> +struct nouveau_oclass *
>> +gk20a_instmem_oclass = &(struct nouveau_instmem_impl) {
>> + .base.handle = NV_SUBDEV(INSTMEM, 0x50),
>> + .base.ofuncs = &(struct nouveau_ofuncs) {
>> + .ctor = gk20a_instmem_ctor,
>> + .dtor = _nouveau_instmem_dtor,
>> + .init = gk20a_instmem_init,
>> + .fini = gk20a_instmem_fini,
>> + },
>> + .instobj = &nv50_instobj_oclass.base,
>> +}.base;
>> --
>> 1.9.1
>>
>> _______________________________________________
>> Nouveau mailing list
>> [email protected]
>> http://lists.freedesktop.org/mailman/listinfo/nouveau

2014-12-24 13:23:12

by Lucas Stach

[permalink] [raw]
Subject: Re: [PATCH nouveau 06/11] platform: complete the power up/down sequence

Am Dienstag, den 23.12.2014, 18:39 +0800 schrieb Vince Hsu:
> This patch adds some missing pieces of the rail gaing/ungating sequence that
> can improve the stability in theory.
>
> Signed-off-by: Vince Hsu <[email protected]>
> ---
> drm/nouveau_platform.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> drm/nouveau_platform.h | 3 +++
> 2 files changed, 45 insertions(+)
>
> diff --git a/drm/nouveau_platform.c b/drm/nouveau_platform.c
> index 68788b17a45c..527fe2358fc9 100644
> --- a/drm/nouveau_platform.c
> +++ b/drm/nouveau_platform.c
> @@ -25,9 +25,11 @@
> #include <linux/module.h>
> #include <linux/platform_device.h>
> #include <linux/of.h>
> +#include <linux/of_platform.h>
> #include <linux/reset.h>
> #include <linux/regulator/consumer.h>
> #include <soc/tegra/fuse.h>
> +#include <soc/tegra/mc.h>
> #include <soc/tegra/pmc.h>
>
> #include "nouveau_drm.h"
> @@ -61,6 +63,9 @@ static int nouveau_platform_power_up(struct nouveau_platform_gpu *gpu)
> reset_control_deassert(gpu->rst);
> udelay(10);
>
> + tegra_mc_flush(gpu->mc, gpu->swgroup, false);
> + udelay(10);
> +
> return 0;
>
> err_clamp:
> @@ -77,6 +82,14 @@ static int nouveau_platform_power_down(struct nouveau_platform_gpu *gpu)
> {
> int err;
>
> + tegra_mc_flush(gpu->mc, gpu->swgroup, true);
> + udelay(10);
> +
> + err = tegra_powergate_gpu_set_clamping(true);
> + if (err)
> + return err;
> + udelay(10);
> +
> reset_control_assert(gpu->rst);
> udelay(10);
>
> @@ -91,6 +104,31 @@ static int nouveau_platform_power_down(struct nouveau_platform_gpu *gpu)
> return 0;
> }
>
> +static int nouveau_platform_get_mc(struct device *dev,
> + struct tegra_mc **mc, unsigned int *swgroup)

Uhm, no. If this is needed this has to be a Tegra MC function and not
burried into nouveau code. You are using knowledge about the internal
workings of the MC driver here.

Also this should probably only take the Dt node pointer as argument and
return a something like a tegra_mc_client struct that contains both the
MC device pointer and the swgroup so you can pass that to
tegra_mc_flush().

> +{
> + struct of_phandle_args args;
> + struct platform_device *pdev;
> + int ret;
> +
> + ret = of_parse_phandle_with_fixed_args(dev->of_node, "mc",
> + 1, 0, &args);
> + if (ret)
> + return ret;
> +
> + pdev = of_find_device_by_node(args.np);
> + if (!pdev)
> + return -EINVAL;

This is wrong, you need to handle -EPROBE_DEFER here.

> +
> + *mc = platform_get_drvdata(pdev);
> + if (!*mc)
> + return -EINVAL;
> +
> + *swgroup = args.args[0];
> +
> + return 0;
> +}
> +
> static int nouveau_platform_probe(struct platform_device *pdev)
> {
> struct nouveau_platform_gpu *gpu;
> @@ -118,6 +156,10 @@ static int nouveau_platform_probe(struct platform_device *pdev)
> if (IS_ERR(gpu->clk_pwr))
> return PTR_ERR(gpu->clk_pwr);
>
> + err = nouveau_platform_get_mc(&pdev->dev, &gpu->mc, &gpu->swgroup);
> + if (err)
> + return err;
> +
> err = nouveau_platform_power_up(gpu);
> if (err)
> return err;
> diff --git a/drm/nouveau_platform.h b/drm/nouveau_platform.h
> index 58c28b5653d5..969dbddd786d 100644
> --- a/drm/nouveau_platform.h
> +++ b/drm/nouveau_platform.h
> @@ -35,6 +35,9 @@ struct nouveau_platform_gpu {
> struct clk *clk_pwr;
>
> struct regulator *vdd;
> +
> + struct tegra_mc *mc;
> + unsigned int swgroup;
> };
>
> struct nouveau_platform_device {

2014-12-24 13:26:14

by Lucas Stach

[permalink] [raw]
Subject: Re: [PATCH 1/11] ARM: tegra: add function to control the GPU rail clamp

Am Dienstag, den 23.12.2014, 18:39 +0800 schrieb Vince Hsu:
> The Tegra124 and later Tegra SoCs have a sepatate rail gating register
> to enable/disable the clamp. The original function
> tegra_powergate_remove_clamping() is not sufficient for the enable
> function. So add a new function which is dedicated to the GPU rail
> gating. Also don't refer to the powergate ID since the GPU ID makes no
> sense here.
>
> Signed-off-by: Vince Hsu <[email protected]>

To be honest I don't see the point of this patch.
You are bloating the PMC interface by introducing another exported
function that does nothing different than what the current function
already does.

If you need a way to assert the clamp I would have expected you to
introduce a common function to do this for all power partitions.

Other comments inline.

Regards,
Lucas

> ---
> drivers/soc/tegra/pmc.c | 34 +++++++++++++++++++++++-----------
> include/soc/tegra/pmc.h | 2 ++
> 2 files changed, 25 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/soc/tegra/pmc.c b/drivers/soc/tegra/pmc.c
> index a2c0ceb95f8f..7798c530ead1 100644
> --- a/drivers/soc/tegra/pmc.c
> +++ b/drivers/soc/tegra/pmc.c
> @@ -225,17 +225,6 @@ int tegra_powergate_remove_clamping(int id)
> return -EINVAL;
>
> /*
> - * The Tegra124 GPU has a separate register (with different semantics)
> - * to remove clamps.
> - */
> - if (tegra_get_chip_id() == TEGRA124) {
> - if (id == TEGRA_POWERGATE_3D) {
> - tegra_pmc_writel(0, GPU_RG_CNTRL);
> - return 0;
> - }
> - }
> -
> - /*
> * Tegra 2 has a bug where PCIE and VDE clamping masks are
> * swapped relatively to the partition ids
> */
> @@ -253,6 +242,29 @@ int tegra_powergate_remove_clamping(int id)
> EXPORT_SYMBOL(tegra_powergate_remove_clamping);
>
> /**
> + * tegra_powergate_gpu_set_clamping - control GPU-SOC clamps
> + *
> + * The post-Tegra114 chips have a separate rail gating register to configure
> + * clamps.
> + *
> + * @assert: true to assert clamp, and false to remove
> + */
> +int tegra_powergate_gpu_set_clamping(bool assert)

Those functions with a bool parameter to set/unset something are really
annoying. Please avoid this pattern. The naming of the original function
is much more intuitive.

> +{
> + if (!pmc->soc)
> + return -EINVAL;
> +
> + if (tegra_get_chip_id() == TEGRA124) {
> + tegra_pmc_writel(assert ? 1 : 0, GPU_RG_CNTRL);
> + tegra_pmc_readl(GPU_RG_CNTRL);

You are reading the register back here, which to me seems like you are
trying to make sure that the write is flushed. Why is this needed?
I also observed the need to do this while working on Tegra124 PCIe in
Barebox, otherwise the partition wouldn't power up. I didn't have time
to follow up on this yet, so it would be nice if you could explain why
this is needed, or if you don't know ask HW about it.

> + return 0;
> + }
> +
> + return -EINVAL;
> +}
> +EXPORT_SYMBOL(tegra_powergate_gpu_set_clamping);
> +
> +/**
> * tegra_powergate_sequence_power_up() - power up partition
> * @id: partition ID
> * @clk: clock for partition
> diff --git a/include/soc/tegra/pmc.h b/include/soc/tegra/pmc.h
> index 65a93273e72f..53d620525a9e 100644
> --- a/include/soc/tegra/pmc.h
> +++ b/include/soc/tegra/pmc.h
> @@ -109,6 +109,8 @@ int tegra_powergate_is_powered(int id);
> int tegra_powergate_power_on(int id);
> int tegra_powergate_power_off(int id);
> int tegra_powergate_remove_clamping(int id);
> +/* Only for Tegra124 and later */
> +int tegra_powergate_gpu_set_clamping(bool assert);
>
> /* Must be called with clk disabled, and returns with clk enabled */
> int tegra_powergate_sequence_power_up(int id, struct clk *clk,

2014-12-24 13:53:16

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH 1/11] ARM: tegra: add function to control the GPU rail clamp

I think "ARM: tegra:" is wrong prefix for this patch and "soc: tegra:" should be
used instead to show that it belongs to SoC driver, not arch code.

--
Dmitry

2014-12-25 02:05:57

by Vince Hsu

[permalink] [raw]
Subject: Re: [PATCH 1/11] ARM: tegra: add function to control the GPU rail clamp

On 12/24/2014 09:52 PM, Dmitry Osipenko wrote:
> I think "ARM: tegra:" is wrong prefix for this patch and "soc: tegra:" should be
> used instead to show that it belongs to SoC driver, not arch code.
Indeed. Will fix in v2.

Thanks,
Vince

2014-12-25 02:28:29

by Vince Hsu

[permalink] [raw]
Subject: Re: [PATCH 1/11] ARM: tegra: add function to control the GPU rail clamp

On 12/24/2014 09:16 PM, Lucas Stach wrote:
> Am Dienstag, den 23.12.2014, 18:39 +0800 schrieb Vince Hsu:
>> The Tegra124 and later Tegra SoCs have a sepatate rail gating register
>> to enable/disable the clamp. The original function
>> tegra_powergate_remove_clamping() is not sufficient for the enable
>> function. So add a new function which is dedicated to the GPU rail
>> gating. Also don't refer to the powergate ID since the GPU ID makes no
>> sense here.
>>
>> Signed-off-by: Vince Hsu <[email protected]>
> To be honest I don't see the point of this patch.
> You are bloating the PMC interface by introducing another exported
> function that does nothing different than what the current function
> already does.
>
> If you need a way to assert the clamp I would have expected you to
> introduce a common function to do this for all power partitions.
I thought about adding an tegra_powergate_assert_clamping(), but that
doesn't make sense to all the power partitions except GPU. Note the
difference in TRM. Any suggestion for the common function?
>
> Other comments inline.
>
> Regards,
> Lucas
>
>> ---
>> drivers/soc/tegra/pmc.c | 34 +++++++++++++++++++++++-----------
>> include/soc/tegra/pmc.h | 2 ++
>> 2 files changed, 25 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/soc/tegra/pmc.c b/drivers/soc/tegra/pmc.c
>> index a2c0ceb95f8f..7798c530ead1 100644
>> --- a/drivers/soc/tegra/pmc.c
>> +++ b/drivers/soc/tegra/pmc.c
>> @@ -225,17 +225,6 @@ int tegra_powergate_remove_clamping(int id)
>> return -EINVAL;
>>
>> /*
>> - * The Tegra124 GPU has a separate register (with different semantics)
>> - * to remove clamps.
>> - */
>> - if (tegra_get_chip_id() == TEGRA124) {
>> - if (id == TEGRA_POWERGATE_3D) {
>> - tegra_pmc_writel(0, GPU_RG_CNTRL);
>> - return 0;
>> - }
>> - }
>> -
>> - /*
>> * Tegra 2 has a bug where PCIE and VDE clamping masks are
>> * swapped relatively to the partition ids
>> */
>> @@ -253,6 +242,29 @@ int tegra_powergate_remove_clamping(int id)
>> EXPORT_SYMBOL(tegra_powergate_remove_clamping);
>>
>> /**
>> + * tegra_powergate_gpu_set_clamping - control GPU-SOC clamps
>> + *
>> + * The post-Tegra114 chips have a separate rail gating register to configure
>> + * clamps.
>> + *
>> + * @assert: true to assert clamp, and false to remove
>> + */
>> +int tegra_powergate_gpu_set_clamping(bool assert)
> Those functions with a bool parameter to set/unset something are really
> annoying. Please avoid this pattern. The naming of the original function
> is much more intuitive.
But the original function is not sufficient. Maybe add a
tegra_powergate_assert_gpu_clamping()? That way I will prefer to adding
one more removal function for GPU. And then again that's a bloat, too.
>
>> +{
>> + if (!pmc->soc)
>> + return -EINVAL;
>> +
>> + if (tegra_get_chip_id() == TEGRA124) {
>> + tegra_pmc_writel(assert ? 1 : 0, GPU_RG_CNTRL);
>> + tegra_pmc_readl(GPU_RG_CNTRL);
> You are reading the register back here, which to me seems like you are
> trying to make sure that the write is flushed. Why is this needed?
> I also observed the need to do this while working on Tegra124 PCIe in
> Barebox, otherwise the partition wouldn't power up. I didn't have time
> to follow up on this yet, so it would be nice if you could explain why
> this is needed, or if you don't know ask HW about it.
That's a read fence to assure the post of the previous writes through
Tegra interconnect. (copy-paster from
https://android.googlesource.com/kernel/tegra.git/+/28b107dcb3aa122de8e94e48af548140d519298f)
>
>> + return 0;
>> + }
>> +
>> + return -EINVAL;
>> +}
>> +EXPORT_SYMBOL(tegra_powergate_gpu_set_clamping);
>> +
>> +/**
>> * tegra_powergate_sequence_power_up() - power up partition
>> * @id: partition ID
>> * @clk: clock for partition
>> diff --git a/include/soc/tegra/pmc.h b/include/soc/tegra/pmc.h
>> index 65a93273e72f..53d620525a9e 100644
>> --- a/include/soc/tegra/pmc.h
>> +++ b/include/soc/tegra/pmc.h
>> @@ -109,6 +109,8 @@ int tegra_powergate_is_powered(int id);
>> int tegra_powergate_power_on(int id);
>> int tegra_powergate_power_off(int id);
>> int tegra_powergate_remove_clamping(int id);
>> +/* Only for Tegra124 and later */
>> +int tegra_powergate_gpu_set_clamping(bool assert);
>>
>> /* Must be called with clk disabled, and returns with clk enabled */
>> int tegra_powergate_sequence_power_up(int id, struct clk *clk,
>

2014-12-25 02:42:55

by Vince Hsu

[permalink] [raw]
Subject: Re: [PATCH nouveau 06/11] platform: complete the power up/down sequence


On 12/24/2014 09:23 PM, Lucas Stach wrote:
> Am Dienstag, den 23.12.2014, 18:39 +0800 schrieb Vince Hsu:
>> This patch adds some missing pieces of the rail gaing/ungating sequence that
>> can improve the stability in theory.
>>
>> Signed-off-by: Vince Hsu <[email protected]>
>> ---
>> drm/nouveau_platform.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>> drm/nouveau_platform.h | 3 +++
>> 2 files changed, 45 insertions(+)
>>
>> diff --git a/drm/nouveau_platform.c b/drm/nouveau_platform.c
>> index 68788b17a45c..527fe2358fc9 100644
>> --- a/drm/nouveau_platform.c
>> +++ b/drm/nouveau_platform.c
>> @@ -25,9 +25,11 @@
>> #include <linux/module.h>
>> #include <linux/platform_device.h>
>> #include <linux/of.h>
>> +#include <linux/of_platform.h>
>> #include <linux/reset.h>
>> #include <linux/regulator/consumer.h>
>> #include <soc/tegra/fuse.h>
>> +#include <soc/tegra/mc.h>
>> #include <soc/tegra/pmc.h>
>>
>> #include "nouveau_drm.h"
>> @@ -61,6 +63,9 @@ static int nouveau_platform_power_up(struct nouveau_platform_gpu *gpu)
>> reset_control_deassert(gpu->rst);
>> udelay(10);
>>
>> + tegra_mc_flush(gpu->mc, gpu->swgroup, false);
>> + udelay(10);
>> +
>> return 0;
>>
>> err_clamp:
>> @@ -77,6 +82,14 @@ static int nouveau_platform_power_down(struct nouveau_platform_gpu *gpu)
>> {
>> int err;
>>
>> + tegra_mc_flush(gpu->mc, gpu->swgroup, true);
>> + udelay(10);
>> +
>> + err = tegra_powergate_gpu_set_clamping(true);
>> + if (err)
>> + return err;
>> + udelay(10);
>> +
>> reset_control_assert(gpu->rst);
>> udelay(10);
>>
>> @@ -91,6 +104,31 @@ static int nouveau_platform_power_down(struct nouveau_platform_gpu *gpu)
>> return 0;
>> }
>>
>> +static int nouveau_platform_get_mc(struct device *dev,
>> + struct tegra_mc **mc, unsigned int *swgroup)
> Uhm, no. If this is needed this has to be a Tegra MC function and not
> burried into nouveau code. You are using knowledge about the internal
> workings of the MC driver here.
>
> Also this should probably only take the Dt node pointer as argument and
> return a something like a tegra_mc_client struct that contains both the
> MC device pointer and the swgroup so you can pass that to
> tegra_mc_flush().
Good idea. I will have something as below in V2 if there is no other
comments for this.

tegra_mc_client *tegra_mc_find_client(struct device_node *node)
{
...
ret = of_parse_phandle_with_args(node, "nvidia,memory-client", ...)
...
}

There were some discussion about this few weeks ago. I'm not sure
whether we have some conclusion/implementation though. Thierry?

http://lists.infradead.org/pipermail/linux-arm-kernel/2014-December/308703.html

>
>> +{
>> + struct of_phandle_args args;
>> + struct platform_device *pdev;
>> + int ret;
>> +
>> + ret = of_parse_phandle_with_fixed_args(dev->of_node, "mc",
>> + 1, 0, &args);
>> + if (ret)
>> + return ret;
>> +
>> + pdev = of_find_device_by_node(args.np);
>> + if (!pdev)
>> + return -EINVAL;
> This is wrong, you need to handle -EPROBE_DEFER here.
Indeed. Will fix.
>
>> +
>> + *mc = platform_get_drvdata(pdev);
>> + if (!*mc)
>> + return -EINVAL;
>> +
>> + *swgroup = args.args[0];
>> +
>> + return 0;
>> +}
>> +
>> static int nouveau_platform_probe(struct platform_device *pdev)
>> {
>> struct nouveau_platform_gpu *gpu;
>> @@ -118,6 +156,10 @@ static int nouveau_platform_probe(struct platform_device *pdev)
>> if (IS_ERR(gpu->clk_pwr))
>> return PTR_ERR(gpu->clk_pwr);
>>
>> + err = nouveau_platform_get_mc(&pdev->dev, &gpu->mc, &gpu->swgroup);
>> + if (err)
>> + return err;
>> +
>> err = nouveau_platform_power_up(gpu);
>> if (err)
>> return err;
>> diff --git a/drm/nouveau_platform.h b/drm/nouveau_platform.h
>> index 58c28b5653d5..969dbddd786d 100644
>> --- a/drm/nouveau_platform.h
>> +++ b/drm/nouveau_platform.h
>> @@ -35,6 +35,9 @@ struct nouveau_platform_gpu {
>> struct clk *clk_pwr;
>>
>> struct regulator *vdd;
>> +
>> + struct tegra_mc *mc;
>> + unsigned int swgroup;
>> };
>>
>> struct nouveau_platform_device {
>

2014-12-25 20:34:53

by Lucas Stach

[permalink] [raw]
Subject: Re: [PATCH 1/11] ARM: tegra: add function to control the GPU rail clamp

Am Donnerstag, den 25.12.2014, 10:28 +0800 schrieb Vince Hsu:
> On 12/24/2014 09:16 PM, Lucas Stach wrote:
> > Am Dienstag, den 23.12.2014, 18:39 +0800 schrieb Vince Hsu:
> >> The Tegra124 and later Tegra SoCs have a sepatate rail gating register
> >> to enable/disable the clamp. The original function
> >> tegra_powergate_remove_clamping() is not sufficient for the enable
> >> function. So add a new function which is dedicated to the GPU rail
> >> gating. Also don't refer to the powergate ID since the GPU ID makes no
> >> sense here.
> >>
> >> Signed-off-by: Vince Hsu <[email protected]>
> > To be honest I don't see the point of this patch.
> > You are bloating the PMC interface by introducing another exported
> > function that does nothing different than what the current function
> > already does.
> >
> > If you need a way to assert the clamp I would have expected you to
> > introduce a common function to do this for all power partitions.
> I thought about adding an tegra_powergate_assert_clamping(), but that
> doesn't make sense to all the power partitions except GPU. Note the
> difference in TRM. Any suggestion for the common function?

Is there anything speaking against adding this function and only accept
the GPU partition as valid parameter for now. So at least the interface
stays symmetric and can be easily extended if any future partitions have
similar characteristics as the GPU one.

> >
> > Other comments inline.
> >
> > Regards,
> > Lucas
> >
> >> ---
> >> drivers/soc/tegra/pmc.c | 34 +++++++++++++++++++++++-----------
> >> include/soc/tegra/pmc.h | 2 ++
> >> 2 files changed, 25 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/drivers/soc/tegra/pmc.c b/drivers/soc/tegra/pmc.c
> >> index a2c0ceb95f8f..7798c530ead1 100644
> >> --- a/drivers/soc/tegra/pmc.c
> >> +++ b/drivers/soc/tegra/pmc.c
> >> @@ -225,17 +225,6 @@ int tegra_powergate_remove_clamping(int id)
> >> return -EINVAL;
> >>
> >> /*
> >> - * The Tegra124 GPU has a separate register (with different semantics)
> >> - * to remove clamps.
> >> - */
> >> - if (tegra_get_chip_id() == TEGRA124) {
> >> - if (id == TEGRA_POWERGATE_3D) {
> >> - tegra_pmc_writel(0, GPU_RG_CNTRL);
> >> - return 0;
> >> - }
> >> - }
> >> -
> >> - /*
> >> * Tegra 2 has a bug where PCIE and VDE clamping masks are
> >> * swapped relatively to the partition ids
> >> */
> >> @@ -253,6 +242,29 @@ int tegra_powergate_remove_clamping(int id)
> >> EXPORT_SYMBOL(tegra_powergate_remove_clamping);
> >>
> >> /**
> >> + * tegra_powergate_gpu_set_clamping - control GPU-SOC clamps
> >> + *
> >> + * The post-Tegra114 chips have a separate rail gating register to configure
> >> + * clamps.
> >> + *
> >> + * @assert: true to assert clamp, and false to remove
> >> + */
> >> +int tegra_powergate_gpu_set_clamping(bool assert)
> > Those functions with a bool parameter to set/unset something are really
> > annoying. Please avoid this pattern. The naming of the original function
> > is much more intuitive.
> But the original function is not sufficient. Maybe add a
> tegra_powergate_assert_gpu_clamping()? That way I will prefer to adding
> one more removal function for GPU. And then again that's a bloat, too.
> >
> >> +{
> >> + if (!pmc->soc)
> >> + return -EINVAL;
> >> +
> >> + if (tegra_get_chip_id() == TEGRA124) {
> >> + tegra_pmc_writel(assert ? 1 : 0, GPU_RG_CNTRL);
> >> + tegra_pmc_readl(GPU_RG_CNTRL);
> > You are reading the register back here, which to me seems like you are
> > trying to make sure that the write is flushed. Why is this needed?
> > I also observed the need to do this while working on Tegra124 PCIe in
> > Barebox, otherwise the partition wouldn't power up. I didn't have time
> > to follow up on this yet, so it would be nice if you could explain why
> > this is needed, or if you don't know ask HW about it.
> That's a read fence to assure the post of the previous writes through
> Tegra interconnect. (copy-paster from
> https://android.googlesource.com/kernel/tegra.git/+/28b107dcb3aa122de8e94e48af548140d519298f)

I see what it does, the question is more about why this is needed.
What is the Tegra interconnect? According to the TRM the Tegra contains
some standard AXI <-> AHB <-> APB bridges. That a read is needed to
assure the write is posted to the APB bus seems to imply that there is
some write buffering in one of those bridges. Can we get this documented
somewhere?

And isn't it needed for the other partition ungating function too then?

Regards,
Lucas

2014-12-29 02:49:23

by Vince Hsu

[permalink] [raw]
Subject: Re: [PATCH 1/11] ARM: tegra: add function to control the GPU rail clamp


On 12/26/2014 04:34 AM, Lucas Stach wrote:
> Am Donnerstag, den 25.12.2014, 10:28 +0800 schrieb Vince Hsu:
>> On 12/24/2014 09:16 PM, Lucas Stach wrote:
>>> Am Dienstag, den 23.12.2014, 18:39 +0800 schrieb Vince Hsu:
>>>> The Tegra124 and later Tegra SoCs have a sepatate rail gating register
>>>> to enable/disable the clamp. The original function
>>>> tegra_powergate_remove_clamping() is not sufficient for the enable
>>>> function. So add a new function which is dedicated to the GPU rail
>>>> gating. Also don't refer to the powergate ID since the GPU ID makes no
>>>> sense here.
>>>>
>>>> Signed-off-by: Vince Hsu <[email protected]>
>>> To be honest I don't see the point of this patch.
>>> You are bloating the PMC interface by introducing another exported
>>> function that does nothing different than what the current function
>>> already does.
>>>
>>> If you need a way to assert the clamp I would have expected you to
>>> introduce a common function to do this for all power partitions.
>> I thought about adding an tegra_powergate_assert_clamping(), but that
>> doesn't make sense to all the power partitions except GPU. Note the
>> difference in TRM. Any suggestion for the common function?
> Is there anything speaking against adding this function and only accept
> the GPU partition as valid parameter for now. So at least the interface
> stays symmetric and can be easily extended if any future partitions have
> similar characteristics as the GPU one.
The register APBDEV_PMC_GPU_RG_CNTRL_0 is only for GPU and can be used
for assertion and deassertion. The APBDEV_PMC_REMOVE_CLAMPING_CMD_0 is
only used for deassertion. If we have any future partitions that can be
asserted by SW like GPU, we can improve the interface then.

>
>>> Other comments inline.
>>>
>>> Regards,
>>> Lucas
>>>
>>>> ---
>>>> drivers/soc/tegra/pmc.c | 34 +++++++++++++++++++++++-----------
>>>> include/soc/tegra/pmc.h | 2 ++
>>>> 2 files changed, 25 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/drivers/soc/tegra/pmc.c b/drivers/soc/tegra/pmc.c
>>>> index a2c0ceb95f8f..7798c530ead1 100644
>>>> --- a/drivers/soc/tegra/pmc.c
>>>> +++ b/drivers/soc/tegra/pmc.c
>>>> @@ -225,17 +225,6 @@ int tegra_powergate_remove_clamping(int id)
>>>> return -EINVAL;
>>>>
>>>> /*
>>>> - * The Tegra124 GPU has a separate register (with different semantics)
>>>> - * to remove clamps.
>>>> - */
>>>> - if (tegra_get_chip_id() == TEGRA124) {
>>>> - if (id == TEGRA_POWERGATE_3D) {
>>>> - tegra_pmc_writel(0, GPU_RG_CNTRL);
>>>> - return 0;
>>>> - }
>>>> - }
>>>> -
>>>> - /*
>>>> * Tegra 2 has a bug where PCIE and VDE clamping masks are
>>>> * swapped relatively to the partition ids
>>>> */
>>>> @@ -253,6 +242,29 @@ int tegra_powergate_remove_clamping(int id)
>>>> EXPORT_SYMBOL(tegra_powergate_remove_clamping);
>>>>
>>>> /**
>>>> + * tegra_powergate_gpu_set_clamping - control GPU-SOC clamps
>>>> + *
>>>> + * The post-Tegra114 chips have a separate rail gating register to configure
>>>> + * clamps.
>>>> + *
>>>> + * @assert: true to assert clamp, and false to remove
>>>> + */
>>>> +int tegra_powergate_gpu_set_clamping(bool assert)
>>> Those functions with a bool parameter to set/unset something are really
>>> annoying. Please avoid this pattern. The naming of the original function
>>> is much more intuitive.
>> But the original function is not sufficient. Maybe add a
>> tegra_powergate_assert_gpu_clamping()? That way I will prefer to adding
>> one more removal function for GPU. And then again that's a bloat, too.
>>>> +{
>>>> + if (!pmc->soc)
>>>> + return -EINVAL;
>>>> +
>>>> + if (tegra_get_chip_id() == TEGRA124) {
>>>> + tegra_pmc_writel(assert ? 1 : 0, GPU_RG_CNTRL);
>>>> + tegra_pmc_readl(GPU_RG_CNTRL);
>>> You are reading the register back here, which to me seems like you are
>>> trying to make sure that the write is flushed. Why is this needed?
>>> I also observed the need to do this while working on Tegra124 PCIe in
>>> Barebox, otherwise the partition wouldn't power up. I didn't have time
>>> to follow up on this yet, so it would be nice if you could explain why
>>> this is needed, or if you don't know ask HW about it.
>> That's a read fence to assure the post of the previous writes through
>> Tegra interconnect. (copy-paster from
>> https://android.googlesource.com/kernel/tegra.git/+/28b107dcb3aa122de8e94e48af548140d519298f)
> I see what it does, the question is more about why this is needed.
> What is the Tegra interconnect? According to the TRM the Tegra contains
> some standard AXI <-> AHB <-> APB bridges. That a read is needed to
> assure the write is posted to the APB bus seems to imply that there is
> some write buffering in one of those bridges. Can we get this documented
> somewhere?
The TRM does mention a read after the write. Check the section 32.2.2.3.

Thanks,
Vince

>
> And isn't it needed for the other partition ungating function too then?
I believe yes.

>
> Regards,
> Lucas
>
>

2014-12-30 02:39:35

by Emil Velikov

[permalink] [raw]
Subject: Re: [Nouveau] [PATCH nouveau 09/11] drm: export some variable and functions to resue the PM functions

On 23/12/14 10:40, Vince Hsu wrote:
> This patch adds some checks in the suspend/resume functions to distinguish
> the dGPU and mobile GPU and exports some variables/functions so that the
> nouveau platform device can reuse them.
>
Hi Vince,

Afaiu one needs to export a symbol as it's used by another module or
subsystem. With the follow up two patches you are not doing either one,
so I'd assume that you can just omit the EXPORT_* changes.

I could be wrong though :-)

Cheers
Emil


> Signed-off-by: Vince Hsu <[email protected]>
> ---
> drm/nouveau_drm.c | 16 +++++++++++-----
> drm/nouveau_drm.h | 2 ++
> 2 files changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/drm/nouveau_drm.c b/drm/nouveau_drm.c
> index afb93bb72f97..0ed99ef80211 100644
> --- a/drm/nouveau_drm.c
> +++ b/drm/nouveau_drm.c
> @@ -72,6 +72,7 @@ module_param_named(modeset, nouveau_modeset, int, 0400);
>
> MODULE_PARM_DESC(runpm, "disable (0), force enable (1), optimus only default (-1)");
> int nouveau_runtime_pm = -1;
> +EXPORT_SYMBOL(nouveau_runtime_pm);
> module_param_named(runpm, nouveau_runtime_pm, int, 0400);
>
> static struct drm_driver driver_stub;
> @@ -543,7 +544,7 @@ nouveau_drm_remove(struct pci_dev *pdev)
> nouveau_drm_device_remove(dev);
> }
>
> -static int
> +int
> nouveau_do_suspend(struct drm_device *dev, bool runtime)
> {
> struct nouveau_drm *drm = nouveau_drm(dev);
> @@ -559,8 +560,10 @@ nouveau_do_suspend(struct drm_device *dev, bool runtime)
> return ret;
> }
>
> - NV_INFO(drm, "evicting buffers...\n");
> - ttm_bo_evict_mm(&drm->ttm.bdev, TTM_PL_VRAM);
> + if (dev->pdev) {
> + NV_INFO(drm, "evicting buffers...\n");
> + ttm_bo_evict_mm(&drm->ttm.bdev, TTM_PL_VRAM);
> + }
>
> NV_INFO(drm, "waiting for kernel channels to go idle...\n");
> if (drm->cechan) {
> @@ -612,8 +615,9 @@ fail_display:
> }
> return ret;
> }
> +EXPORT_SYMBOL(nouveau_do_suspend);
>
> -static int
> +int
> nouveau_do_resume(struct drm_device *dev, bool runtime)
> {
> struct nouveau_drm *drm = nouveau_drm(dev);
> @@ -635,7 +639,8 @@ nouveau_do_resume(struct drm_device *dev, bool runtime)
> nvif_client_resume(&cli->base);
> }
>
> - nouveau_run_vbios_init(dev);
> + if (dev->pdev)
> + nouveau_run_vbios_init(dev);
>
> if (dev->mode_config.num_crtc) {
> NV_INFO(drm, "resuming display...\n");
> @@ -646,6 +651,7 @@ nouveau_do_resume(struct drm_device *dev, bool runtime)
>
> return 0;
> }
> +EXPORT_SYMBOL(nouveau_do_resume);
>
> int
> nouveau_pmops_suspend(struct device *dev)
> diff --git a/drm/nouveau_drm.h b/drm/nouveau_drm.h
> index 8ae36f265fb8..897d295dd1e3 100644
> --- a/drm/nouveau_drm.h
> +++ b/drm/nouveau_drm.h
> @@ -177,6 +177,8 @@ nouveau_drm(struct drm_device *dev)
>
> int nouveau_pmops_suspend(struct device *);
> int nouveau_pmops_resume(struct device *);
> +int nouveau_do_suspend(struct drm_device *dev, bool runtime);
> +int nouveau_do_resume(struct drm_device *dev, bool runtime);
>
> #define nouveau_platform_device_create(p, u) \
> nouveau_platform_device_create_(p, sizeof(**u), (void **)u)
>

2014-12-30 03:18:18

by Vince Hsu

[permalink] [raw]
Subject: Re: [Nouveau] [PATCH nouveau 09/11] drm: export some variable and functions to resue the PM functions

Hi Emil,

On 12/30/2014 10:34 AM, Emil Velikov wrote:
> On 23/12/14 10:40, Vince Hsu wrote:
>> This patch adds some checks in the suspend/resume functions to distinguish
>> the dGPU and mobile GPU and exports some variables/functions so that the
>> nouveau platform device can reuse them.
>>
> Hi Vince,
>
> Afaiu one needs to export a symbol as it's used by another module or
> subsystem. With the follow up two patches you are not doing either one,
> so I'd assume that you can just omit the EXPORT_* changes.
The nouveau platform device driver is built as another module -
nouveau_platform.ko. :)

Thanks,
Vince

>
> I could be wrong though :-)
>
> Cheers
> Emil
>
>
>> Signed-off-by: Vince Hsu <[email protected]>
>> ---
>> drm/nouveau_drm.c | 16 +++++++++++-----
>> drm/nouveau_drm.h | 2 ++
>> 2 files changed, 13 insertions(+), 5 deletions(-)
>>
>> diff --git a/drm/nouveau_drm.c b/drm/nouveau_drm.c
>> index afb93bb72f97..0ed99ef80211 100644
>> --- a/drm/nouveau_drm.c
>> +++ b/drm/nouveau_drm.c
>> @@ -72,6 +72,7 @@ module_param_named(modeset, nouveau_modeset, int, 0400);
>>
>> MODULE_PARM_DESC(runpm, "disable (0), force enable (1), optimus only default (-1)");
>> int nouveau_runtime_pm = -1;
>> +EXPORT_SYMBOL(nouveau_runtime_pm);
>> module_param_named(runpm, nouveau_runtime_pm, int, 0400);
>>
>> static struct drm_driver driver_stub;
>> @@ -543,7 +544,7 @@ nouveau_drm_remove(struct pci_dev *pdev)
>> nouveau_drm_device_remove(dev);
>> }
>>
>> -static int
>> +int
>> nouveau_do_suspend(struct drm_device *dev, bool runtime)
>> {
>> struct nouveau_drm *drm = nouveau_drm(dev);
>> @@ -559,8 +560,10 @@ nouveau_do_suspend(struct drm_device *dev, bool runtime)
>> return ret;
>> }
>>
>> - NV_INFO(drm, "evicting buffers...\n");
>> - ttm_bo_evict_mm(&drm->ttm.bdev, TTM_PL_VRAM);
>> + if (dev->pdev) {
>> + NV_INFO(drm, "evicting buffers...\n");
>> + ttm_bo_evict_mm(&drm->ttm.bdev, TTM_PL_VRAM);
>> + }
>>
>> NV_INFO(drm, "waiting for kernel channels to go idle...\n");
>> if (drm->cechan) {
>> @@ -612,8 +615,9 @@ fail_display:
>> }
>> return ret;
>> }
>> +EXPORT_SYMBOL(nouveau_do_suspend);
>>
>> -static int
>> +int
>> nouveau_do_resume(struct drm_device *dev, bool runtime)
>> {
>> struct nouveau_drm *drm = nouveau_drm(dev);
>> @@ -635,7 +639,8 @@ nouveau_do_resume(struct drm_device *dev, bool runtime)
>> nvif_client_resume(&cli->base);
>> }
>>
>> - nouveau_run_vbios_init(dev);
>> + if (dev->pdev)
>> + nouveau_run_vbios_init(dev);
>>
>> if (dev->mode_config.num_crtc) {
>> NV_INFO(drm, "resuming display...\n");
>> @@ -646,6 +651,7 @@ nouveau_do_resume(struct drm_device *dev, bool runtime)
>>
>> return 0;
>> }
>> +EXPORT_SYMBOL(nouveau_do_resume);
>>
>> int
>> nouveau_pmops_suspend(struct device *dev)
>> diff --git a/drm/nouveau_drm.h b/drm/nouveau_drm.h
>> index 8ae36f265fb8..897d295dd1e3 100644
>> --- a/drm/nouveau_drm.h
>> +++ b/drm/nouveau_drm.h
>> @@ -177,6 +177,8 @@ nouveau_drm(struct drm_device *dev)
>>
>> int nouveau_pmops_suspend(struct device *);
>> int nouveau_pmops_resume(struct device *);
>> +int nouveau_do_suspend(struct drm_device *dev, bool runtime);
>> +int nouveau_do_resume(struct drm_device *dev, bool runtime);
>>
>> #define nouveau_platform_device_create(p, u) \
>> nouveau_platform_device_create_(p, sizeof(**u), (void **)u)
>>

2014-12-30 16:42:38

by Lucas Stach

[permalink] [raw]
Subject: Re: [PATCH 1/11] ARM: tegra: add function to control the GPU rail clamp

Am Montag, den 29.12.2014, 10:49 +0800 schrieb Vince Hsu:

[...]

> >> That's a read fence to assure the post of the previous writes through
> >> Tegra interconnect. (copy-paster from
> >> https://android.googlesource.com/kernel/tegra.git/+/28b107dcb3aa122de8e94e48af548140d519298f)
> > I see what it does, the question is more about why this is needed.
> > What is the Tegra interconnect? According to the TRM the Tegra contains
> > some standard AXI <-> AHB <-> APB bridges. That a read is needed to
> > assure the write is posted to the APB bus seems to imply that there is
> > some write buffering in one of those bridges. Can we get this documented
> > somewhere?
> The TRM does mention a read after the write. Check the section 32.2.2.3.
>
Unfortunately this doesn't seem to be included in the public TRM. It
would be nice if this could be documented either in the next version of
the TRM or as a public Appnote.

Thanks,
Lucas