2020-11-15 21:38:07

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v9 00/17] Introduce memory interconnect for NVIDIA Tegra SoCs

This series brings initial support for memory interconnect to Tegra20,
Tegra30 and Tegra124 SoCs.

For the starter only display controllers and devfreq devices are getting
interconnect API support, others could be supported later on. The display
controllers have the biggest demand for interconnect API right now because
dynamic memory frequency scaling can't be done safely without taking into
account bandwidth requirement from the displays. In particular this series
fixes distorted display output on T30 Ouya and T124 TK1 devices.

Changelog:

v9: - Squashed "memory: tegra30-emc: Factor out clk initialization" into
patch "tegra30: Support interconnect framework".
Suggested by Krzysztof Kozlowski.

- Improved Kconfig in the patch "memory: tegra124-emc: Make driver modular"
by adding CONFIG_TEGRA124_CLK_EMC entry, which makes clk-driver changes
to look a bit more cleaner. Suggested by Krzysztof Kozlowski.

- Dropped voltage regulator support from ICC and DT patches for now
because there is a new discussion about using a power domain abstraction
for controlling the regulator, which is likely to happen.

- Replaced direct "operating-points-v2" property checking in EMC drivers
with checking of a returned error code from dev_pm_opp_of_add_table().
Note that I haven't touched T20 EMC driver because it's very likely
that we'll replace that code with a common helper soon anyways.
Suggested by Viresh Kumar.

- The T30 DT patches now include EMC OPP changes for Ouya board, which
is available now in linux-next.

Dmitry Osipenko (17):
memory: tegra30: Support interconnect framework
memory: tegra124-emc: Make driver modular
memory: tegra124-emc: Continue probing if timings are missing in
device-tree
memory: tegra124: Support interconnect framework
drm/tegra: dc: Support memory bandwidth management
drm/tegra: dc: Extend debug stats with total number of events
PM / devfreq: tegra30: Support interconnect and OPPs from device-tree
PM / devfreq: tegra30: Separate configurations per-SoC generation
PM / devfreq: tegra20: Deprecate in a favor of emc-stat based driver
ARM: tegra: Correct EMC registers size in Tegra20 device-tree
ARM: tegra: Add interconnect properties to Tegra20 device-tree
ARM: tegra: Add interconnect properties to Tegra30 device-tree
ARM: tegra: Add interconnect properties to Tegra124 device-tree
ARM: tegra: Add nvidia,memory-controller phandle to Tegra20 EMC
device-tree
ARM: tegra: Add EMC OPP properties to Tegra20 device-trees
ARM: tegra: Add EMC OPP and ICC properties to Tegra30 EMC and ACTMON
device-tree nodes
ARM: tegra: Add EMC OPP and ICC properties to Tegra124 EMC and ACTMON
device-tree nodes

MAINTAINERS | 1 -
arch/arm/boot/dts/tegra124-apalis-emc.dtsi | 8 +
.../arm/boot/dts/tegra124-jetson-tk1-emc.dtsi | 8 +
arch/arm/boot/dts/tegra124-nyan-big-emc.dtsi | 10 +
.../arm/boot/dts/tegra124-nyan-blaze-emc.dtsi | 10 +
.../boot/dts/tegra124-peripherals-opp.dtsi | 419 ++++++++++++++++++
arch/arm/boot/dts/tegra124.dtsi | 31 ++
.../boot/dts/tegra20-acer-a500-picasso.dts | 5 +
arch/arm/boot/dts/tegra20-colibri.dtsi | 4 +
arch/arm/boot/dts/tegra20-paz00.dts | 4 +
.../arm/boot/dts/tegra20-peripherals-opp.dtsi | 92 ++++
arch/arm/boot/dts/tegra20.dtsi | 33 +-
...30-asus-nexus7-grouper-memory-timings.dtsi | 12 +
arch/arm/boot/dts/tegra30-ouya.dts | 8 +
.../arm/boot/dts/tegra30-peripherals-opp.dtsi | 383 ++++++++++++++++
arch/arm/boot/dts/tegra30.dtsi | 33 +-
drivers/clk/tegra/Kconfig | 3 +
drivers/clk/tegra/Makefile | 2 +-
drivers/clk/tegra/clk-tegra124-emc.c | 41 +-
drivers/clk/tegra/clk-tegra124.c | 26 +-
drivers/clk/tegra/clk.h | 18 +-
drivers/devfreq/Kconfig | 10 -
drivers/devfreq/Makefile | 1 -
drivers/devfreq/tegra20-devfreq.c | 210 ---------
drivers/devfreq/tegra30-devfreq.c | 154 ++++---
drivers/gpu/drm/tegra/Kconfig | 1 +
drivers/gpu/drm/tegra/dc.c | 359 +++++++++++++++
drivers/gpu/drm/tegra/dc.h | 19 +
drivers/gpu/drm/tegra/drm.c | 14 +
drivers/gpu/drm/tegra/hub.c | 3 +
drivers/gpu/drm/tegra/plane.c | 121 +++++
drivers/gpu/drm/tegra/plane.h | 15 +
drivers/memory/tegra/Kconfig | 5 +-
drivers/memory/tegra/tegra124-emc.c | 382 ++++++++++++++--
drivers/memory/tegra/tegra124.c | 82 +++-
drivers/memory/tegra/tegra30-emc.c | 349 ++++++++++++++-
drivers/memory/tegra/tegra30.c | 173 +++++++-
include/linux/clk/tegra.h | 8 +
include/soc/tegra/emc.h | 16 -
39 files changed, 2699 insertions(+), 374 deletions(-)
create mode 100644 arch/arm/boot/dts/tegra124-peripherals-opp.dtsi
create mode 100644 arch/arm/boot/dts/tegra20-peripherals-opp.dtsi
create mode 100644 arch/arm/boot/dts/tegra30-peripherals-opp.dtsi
delete mode 100644 drivers/devfreq/tegra20-devfreq.c
delete mode 100644 include/soc/tegra/emc.h

--
2.29.2


2020-11-15 21:38:30

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v9 04/17] memory: tegra124: Support interconnect framework

Now Internal and External memory controllers are memory interconnection
providers. This allows us to use interconnect API for tuning of memory
configuration. EMC driver now supports OPPs and DVFS.

Tested-by: Nicolas Chauvet <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/memory/tegra/Kconfig | 1 +
drivers/memory/tegra/tegra124-emc.c | 325 +++++++++++++++++++++++++++-
drivers/memory/tegra/tegra124.c | 82 ++++++-
3 files changed, 396 insertions(+), 12 deletions(-)

diff --git a/drivers/memory/tegra/Kconfig b/drivers/memory/tegra/Kconfig
index f5b451403c58..a70967a56e52 100644
--- a/drivers/memory/tegra/Kconfig
+++ b/drivers/memory/tegra/Kconfig
@@ -36,6 +36,7 @@ config TEGRA124_EMC
default y
depends on TEGRA_MC && ARCH_TEGRA_124_SOC
select TEGRA124_CLK_EMC
+ select PM_OPP
help
This driver is for the External Memory Controller (EMC) found on
Tegra124 chips. The EMC controls the external DRAM on the board.
diff --git a/drivers/memory/tegra/tegra124-emc.c b/drivers/memory/tegra/tegra124-emc.c
index 8fb8c1af25c9..ed34ca982364 100644
--- a/drivers/memory/tegra/tegra124-emc.c
+++ b/drivers/memory/tegra/tegra124-emc.c
@@ -12,20 +12,26 @@
#include <linux/clk/tegra.h>
#include <linux/debugfs.h>
#include <linux/delay.h>
+#include <linux/interconnect-provider.h>
#include <linux/io.h>
#include <linux/module.h>
+#include <linux/mutex.h>
#include <linux/of_address.h>
#include <linux/of_platform.h>
#include <linux/platform_device.h>
+#include <linux/pm_opp.h>
#include <linux/sort.h>
#include <linux/string.h>

#include <soc/tegra/fuse.h>
#include <soc/tegra/mc.h>

+#include "mc.h"
+
#define EMC_FBIO_CFG5 0x104
#define EMC_FBIO_CFG5_DRAM_TYPE_MASK 0x3
#define EMC_FBIO_CFG5_DRAM_TYPE_SHIFT 0
+#define EMC_FBIO_CFG5_DRAM_WIDTH_X64 BIT(4)

#define EMC_INTSTATUS 0x0
#define EMC_INTSTATUS_CLKCHANGE_COMPLETE BIT(4)
@@ -461,6 +467,17 @@ struct emc_timing {
u32 emc_zcal_interval;
};

+enum emc_rate_request_type {
+ EMC_RATE_DEBUG,
+ EMC_RATE_ICC,
+ EMC_RATE_TYPE_MAX,
+};
+
+struct emc_rate_request {
+ unsigned long min_rate;
+ unsigned long max_rate;
+};
+
struct tegra_emc {
struct device *dev;

@@ -471,6 +488,7 @@ struct tegra_emc {
struct clk *clk;

enum emc_dram_type dram_type;
+ unsigned int dram_bus_width;
unsigned int dram_num;

struct emc_timing last_timing;
@@ -482,6 +500,17 @@ struct tegra_emc {
unsigned long min_rate;
unsigned long max_rate;
} debugfs;
+
+ struct icc_provider provider;
+
+ /*
+ * There are multiple sources in the EMC driver which could request
+ * a min/max clock rate, these rates are contained in this array.
+ */
+ struct emc_rate_request requested_rate[EMC_RATE_TYPE_MAX];
+
+ /* protect shared rate-change code path */
+ struct mutex rate_lock;
};

/* Timing change sequence functions */
@@ -870,6 +899,14 @@ static void emc_read_current_timing(struct tegra_emc *emc,
static int emc_init(struct tegra_emc *emc)
{
emc->dram_type = readl(emc->regs + EMC_FBIO_CFG5);
+
+ if (emc->dram_type & EMC_FBIO_CFG5_DRAM_WIDTH_X64)
+ emc->dram_bus_width = 64;
+ else
+ emc->dram_bus_width = 32;
+
+ dev_info(emc->dev, "%ubit DRAM bus\n", emc->dram_bus_width);
+
emc->dram_type &= EMC_FBIO_CFG5_DRAM_TYPE_MASK;
emc->dram_type >>= EMC_FBIO_CFG5_DRAM_TYPE_SHIFT;

@@ -1009,6 +1046,83 @@ tegra_emc_find_node_by_ram_code(struct device_node *node, u32 ram_code)
return NULL;
}

+static void tegra_emc_rate_requests_init(struct tegra_emc *emc)
+{
+ unsigned int i;
+
+ for (i = 0; i < EMC_RATE_TYPE_MAX; i++) {
+ emc->requested_rate[i].min_rate = 0;
+ emc->requested_rate[i].max_rate = ULONG_MAX;
+ }
+}
+
+static int emc_request_rate(struct tegra_emc *emc,
+ unsigned long new_min_rate,
+ unsigned long new_max_rate,
+ enum emc_rate_request_type type)
+{
+ struct emc_rate_request *req = emc->requested_rate;
+ unsigned long min_rate = 0, max_rate = ULONG_MAX;
+ unsigned int i;
+ int err;
+
+ /* select minimum and maximum rates among the requested rates */
+ for (i = 0; i < EMC_RATE_TYPE_MAX; i++, req++) {
+ if (i == type) {
+ min_rate = max(new_min_rate, min_rate);
+ max_rate = min(new_max_rate, max_rate);
+ } else {
+ min_rate = max(req->min_rate, min_rate);
+ max_rate = min(req->max_rate, max_rate);
+ }
+ }
+
+ if (min_rate > max_rate) {
+ dev_err_ratelimited(emc->dev, "%s: type %u: out of range: %lu %lu\n",
+ __func__, type, min_rate, max_rate);
+ return -ERANGE;
+ }
+
+ /*
+ * EMC rate-changes should go via OPP API because it manages voltage
+ * changes.
+ */
+ err = dev_pm_opp_set_rate(emc->dev, min_rate);
+ if (err)
+ return err;
+
+ emc->requested_rate[type].min_rate = new_min_rate;
+ emc->requested_rate[type].max_rate = new_max_rate;
+
+ return 0;
+}
+
+static int emc_set_min_rate(struct tegra_emc *emc, unsigned long rate,
+ enum emc_rate_request_type type)
+{
+ struct emc_rate_request *req = &emc->requested_rate[type];
+ int ret;
+
+ mutex_lock(&emc->rate_lock);
+ ret = emc_request_rate(emc, rate, req->max_rate, type);
+ mutex_unlock(&emc->rate_lock);
+
+ return ret;
+}
+
+static int emc_set_max_rate(struct tegra_emc *emc, unsigned long rate,
+ enum emc_rate_request_type type)
+{
+ struct emc_rate_request *req = &emc->requested_rate[type];
+ int ret;
+
+ mutex_lock(&emc->rate_lock);
+ ret = emc_request_rate(emc, req->min_rate, rate, type);
+ mutex_unlock(&emc->rate_lock);
+
+ return ret;
+}
+
/*
* debugfs interface
*
@@ -1081,7 +1195,7 @@ static int tegra_emc_debug_min_rate_set(void *data, u64 rate)
if (!tegra_emc_validate_rate(emc, rate))
return -EINVAL;

- err = clk_set_min_rate(emc->clk, rate);
+ err = emc_set_min_rate(emc, rate, EMC_RATE_DEBUG);
if (err < 0)
return err;

@@ -1111,7 +1225,7 @@ static int tegra_emc_debug_max_rate_set(void *data, u64 rate)
if (!tegra_emc_validate_rate(emc, rate))
return -EINVAL;

- err = clk_set_max_rate(emc->clk, rate);
+ err = emc_set_max_rate(emc, rate, EMC_RATE_DEBUG);
if (err < 0)
return err;

@@ -1129,15 +1243,6 @@ static void emc_debugfs_init(struct device *dev, struct tegra_emc *emc)
unsigned int i;
int err;

- emc->clk = devm_clk_get(dev, "emc");
- if (IS_ERR(emc->clk)) {
- if (PTR_ERR(emc->clk) != -ENODEV) {
- dev_err(dev, "failed to get EMC clock: %ld\n",
- PTR_ERR(emc->clk));
- return;
- }
- }
-
emc->debugfs.min_rate = ULONG_MAX;
emc->debugfs.max_rate = 0;

@@ -1177,6 +1282,182 @@ static void emc_debugfs_init(struct device *dev, struct tegra_emc *emc)
emc, &tegra_emc_debug_max_rate_fops);
}

+static inline struct tegra_emc *
+to_tegra_emc_provider(struct icc_provider *provider)
+{
+ return container_of(provider, struct tegra_emc, provider);
+}
+
+static struct icc_node_data *
+emc_of_icc_xlate_extended(struct of_phandle_args *spec, void *data)
+{
+ struct icc_provider *provider = data;
+ struct icc_node_data *ndata;
+ struct icc_node *node;
+
+ /* External Memory is the only possible ICC route */
+ list_for_each_entry(node, &provider->nodes, node_list) {
+ if (node->id != TEGRA_ICC_EMEM)
+ continue;
+
+ ndata = kzalloc(sizeof(*ndata), GFP_KERNEL);
+ if (!ndata)
+ return ERR_PTR(-ENOMEM);
+
+ /*
+ * SRC and DST nodes should have matching TAG in order to have
+ * it set by default for a requested path.
+ */
+ ndata->tag = TEGRA_MC_ICC_TAG_ISO;
+ ndata->node = node;
+
+ return ndata;
+ }
+
+ return ERR_PTR(-EPROBE_DEFER);
+}
+
+static int emc_icc_set(struct icc_node *src, struct icc_node *dst)
+{
+ struct tegra_emc *emc = to_tegra_emc_provider(dst->provider);
+ unsigned long long peak_bw = icc_units_to_bps(dst->peak_bw);
+ unsigned long long avg_bw = icc_units_to_bps(dst->avg_bw);
+ unsigned long long rate = max(avg_bw, peak_bw);
+ unsigned int dram_data_bus_width_bytes;
+ const unsigned int ddr = 2;
+ int err;
+
+ /*
+ * Tegra124 EMC runs on a clock rate of SDRAM bus. This means that
+ * EMC clock rate is twice smaller than the peak data rate because
+ * data is sampled on both EMC clock edges.
+ */
+ dram_data_bus_width_bytes = emc->dram_bus_width / 8;
+ do_div(rate, ddr * dram_data_bus_width_bytes);
+ rate = min_t(u64, rate, U32_MAX);
+
+ err = emc_set_min_rate(emc, rate, EMC_RATE_ICC);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+static int tegra_emc_interconnect_init(struct tegra_emc *emc)
+{
+ const struct tegra_mc_soc *soc = emc->mc->soc;
+ struct icc_node *node;
+ int err;
+
+ emc->provider.dev = emc->dev;
+ emc->provider.set = emc_icc_set;
+ emc->provider.data = &emc->provider;
+ emc->provider.aggregate = soc->icc_ops->aggregate;
+ emc->provider.xlate_extended = emc_of_icc_xlate_extended;
+
+ err = icc_provider_add(&emc->provider);
+ if (err)
+ goto err_msg;
+
+ /* create External Memory Controller node */
+ node = icc_node_create(TEGRA_ICC_EMC);
+ if (IS_ERR(node)) {
+ err = PTR_ERR(node);
+ goto del_provider;
+ }
+
+ node->name = "External Memory Controller";
+ icc_node_add(node, &emc->provider);
+
+ /* link External Memory Controller to External Memory (DRAM) */
+ err = icc_link_create(node, TEGRA_ICC_EMEM);
+ if (err)
+ goto remove_nodes;
+
+ /* create External Memory node */
+ node = icc_node_create(TEGRA_ICC_EMEM);
+ if (IS_ERR(node)) {
+ err = PTR_ERR(node);
+ goto remove_nodes;
+ }
+
+ node->name = "External Memory (DRAM)";
+ icc_node_add(node, &emc->provider);
+
+ return 0;
+
+remove_nodes:
+ icc_nodes_remove(&emc->provider);
+del_provider:
+ icc_provider_del(&emc->provider);
+err_msg:
+ dev_err(emc->dev, "failed to initialize ICC: %d\n", err);
+
+ return err;
+}
+
+static int tegra_emc_opp_table_init(struct tegra_emc *emc)
+{
+ u32 hw_version = BIT(tegra_sku_info.soc_speedo_id);
+ struct opp_table *clk_opp_table, *hw_opp_table;
+ int err;
+
+ clk_opp_table = dev_pm_opp_set_clkname(emc->dev, NULL);
+ err = PTR_ERR_OR_ZERO(clk_opp_table);
+ if (err) {
+ dev_err(emc->dev, "failed to set OPP clk: %d\n", err);
+ return err;
+ }
+
+ hw_opp_table = dev_pm_opp_set_supported_hw(emc->dev, &hw_version, 1);
+ err = PTR_ERR_OR_ZERO(hw_opp_table);
+ if (err) {
+ dev_err(emc->dev, "failed to set OPP supported HW: %d\n", err);
+ goto put_clk_table;
+ }
+
+ err = dev_pm_opp_of_add_table(emc->dev);
+ if (err) {
+ /*
+ * -ENODEV means that this is an older device-tree which
+ * doesn't have OPP table and EMC driver isn't very useful
+ * in this case.
+ */
+ if (err == -ENODEV)
+ dev_err(emc->dev, "OPP table not found, please update your device tree\n");
+ else
+ dev_err(emc->dev, "failed to add OPP table: %d\n", err);
+
+ goto put_hw_table;
+ }
+
+ dev_info(emc->dev, "OPP HW ver. 0x%x, current clock rate %lu MHz\n",
+ hw_version, clk_get_rate(emc->clk) / 1000000);
+
+ /* first dummy rate-set initializes voltage state */
+ err = dev_pm_opp_set_rate(emc->dev, clk_get_rate(emc->clk));
+ if (err) {
+ dev_err(emc->dev, "failed to initialize OPP clock: %d\n", err);
+ goto remove_table;
+ }
+
+ return 0;
+
+remove_table:
+ dev_pm_opp_of_remove_table(emc->dev);
+put_hw_table:
+ dev_pm_opp_put_supported_hw(hw_opp_table);
+put_clk_table:
+ dev_pm_opp_put_clkname(clk_opp_table);
+
+ return err;
+}
+
+static void devm_tegra_emc_unset_callback(void *data)
+{
+ tegra124_clk_set_emc_callbacks(NULL, NULL);
+}
+
static int tegra_emc_probe(struct platform_device *pdev)
{
struct device_node *np;
@@ -1188,6 +1469,7 @@ static int tegra_emc_probe(struct platform_device *pdev)
if (!emc)
return -ENOMEM;

+ mutex_init(&emc->rate_lock);
emc->dev = &pdev->dev;

emc->regs = devm_platform_ioremap_resource(pdev, 0);
@@ -1223,9 +1505,29 @@ static int tegra_emc_probe(struct platform_device *pdev)
tegra124_clk_set_emc_callbacks(tegra_emc_prepare_timing_change,
tegra_emc_complete_timing_change);

+ err = devm_add_action_or_reset(&pdev->dev, devm_tegra_emc_unset_callback,
+ NULL);
+ if (err)
+ return err;
+
+ emc->clk = devm_clk_get(&pdev->dev, "emc");
+ if (IS_ERR(emc->clk)) {
+ err = PTR_ERR(emc->clk);
+ dev_err(&pdev->dev, "failed to get EMC clock: %d\n", err);
+ return err;
+ }
+
+ err = tegra_emc_opp_table_init(emc);
+ if (err)
+ return err;
+
+ tegra_emc_rate_requests_init(emc);
+
if (IS_ENABLED(CONFIG_DEBUG_FS))
emc_debugfs_init(&pdev->dev, emc);

+ tegra_emc_interconnect_init(emc);
+
/*
* Don't allow the kernel module to be unloaded. Unloading adds some
* extra complexity which doesn't really worth the effort in a case of
@@ -1242,6 +1544,7 @@ static struct platform_driver tegra_emc_driver = {
.name = "tegra-emc",
.of_match_table = tegra_emc_of_match,
.suppress_bind_attrs = true,
+ .sync_state = icc_sync_state,
},
};
module_platform_driver(tegra_emc_driver);
diff --git a/drivers/memory/tegra/tegra124.c b/drivers/memory/tegra/tegra124.c
index e2389573d3c0..459211f50c08 100644
--- a/drivers/memory/tegra/tegra124.c
+++ b/drivers/memory/tegra/tegra124.c
@@ -4,7 +4,8 @@
*/

#include <linux/of.h>
-#include <linux/mm.h>
+#include <linux/of_device.h>
+#include <linux/slab.h>

#include <dt-bindings/memory/tegra124-mc.h>

@@ -1010,6 +1011,83 @@ static const struct tegra_mc_reset tegra124_mc_resets[] = {
TEGRA124_MC_RESET(GPU, 0x970, 0x974, 2),
};

+static int tegra124_mc_icc_set(struct icc_node *src, struct icc_node *dst)
+{
+ /* TODO: program PTSA */
+ return 0;
+}
+
+static int tegra124_mc_icc_aggreate(struct icc_node *node, u32 tag, u32 avg_bw,
+ u32 peak_bw, u32 *agg_avg, u32 *agg_peak)
+{
+ /*
+ * ISO clients need to reserve extra bandwidth up-front because
+ * there could be high bandwidth pressure during initial filling
+ * of the client's FIFO buffers. Secondly, we need to take into
+ * account impurities of the memory subsystem.
+ */
+ if (tag & TEGRA_MC_ICC_TAG_ISO)
+ peak_bw = tegra_mc_scale_percents(peak_bw, 400);
+
+ *agg_avg += avg_bw;
+ *agg_peak = max(*agg_peak, peak_bw);
+
+ return 0;
+}
+
+static struct icc_node_data *
+tegra124_mc_of_icc_xlate_extended(struct of_phandle_args *spec, void *data)
+{
+ struct tegra_mc *mc = icc_provider_to_tegra_mc(data);
+ const struct tegra_mc_client *client;
+ unsigned int i, idx = spec->args[0];
+ struct icc_node_data *ndata;
+ struct icc_node *node;
+
+ list_for_each_entry(node, &mc->provider.nodes, node_list) {
+ if (node->id != idx)
+ continue;
+
+ ndata = kzalloc(sizeof(*ndata), GFP_KERNEL);
+ if (!ndata)
+ return ERR_PTR(-ENOMEM);
+
+ client = &mc->soc->clients[idx];
+ ndata->node = node;
+
+ switch (client->swgroup) {
+ case TEGRA_SWGROUP_DC:
+ case TEGRA_SWGROUP_DCB:
+ case TEGRA_SWGROUP_PTC:
+ case TEGRA_SWGROUP_VI:
+ /* these clients are isochronous by default */
+ ndata->tag = TEGRA_MC_ICC_TAG_ISO;
+ break;
+
+ default:
+ ndata->tag = TEGRA_MC_ICC_TAG_DEFAULT;
+ break;
+ }
+
+ return ndata;
+ }
+
+ for (i = 0; i < mc->soc->num_clients; i++) {
+ if (mc->soc->clients[i].id == idx)
+ return ERR_PTR(-EPROBE_DEFER);
+ }
+
+ dev_err(mc->dev, "invalid ICC client ID %u\n", idx);
+
+ return ERR_PTR(-EINVAL);
+}
+
+static const struct tegra_mc_icc_ops tegra124_mc_icc_ops = {
+ .xlate_extended = tegra124_mc_of_icc_xlate_extended,
+ .aggregate = tegra124_mc_icc_aggreate,
+ .set = tegra124_mc_icc_set,
+};
+
#ifdef CONFIG_ARCH_TEGRA_124_SOC
static const unsigned long tegra124_mc_emem_regs[] = {
MC_EMEM_ARB_CFG,
@@ -1061,6 +1139,7 @@ const struct tegra_mc_soc tegra124_mc_soc = {
.reset_ops = &tegra_mc_reset_ops_common,
.resets = tegra124_mc_resets,
.num_resets = ARRAY_SIZE(tegra124_mc_resets),
+ .icc_ops = &tegra124_mc_icc_ops,
};
#endif /* CONFIG_ARCH_TEGRA_124_SOC */

@@ -1091,5 +1170,6 @@ const struct tegra_mc_soc tegra132_mc_soc = {
.reset_ops = &tegra_mc_reset_ops_common,
.resets = tegra124_mc_resets,
.num_resets = ARRAY_SIZE(tegra124_mc_resets),
+ .icc_ops = &tegra124_mc_icc_ops,
};
#endif /* CONFIG_ARCH_TEGRA_132_SOC */
--
2.29.2

2020-11-15 21:39:10

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v9 15/17] ARM: tegra: Add EMC OPP properties to Tegra20 device-trees

Add EMC OPP DVFS tables and update board device-trees by removing
unsupported OPPs.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
.../boot/dts/tegra20-acer-a500-picasso.dts | 5 +
arch/arm/boot/dts/tegra20-colibri.dtsi | 4 +
arch/arm/boot/dts/tegra20-paz00.dts | 4 +
.../arm/boot/dts/tegra20-peripherals-opp.dtsi | 92 +++++++++++++++++++
arch/arm/boot/dts/tegra20.dtsi | 3 +
5 files changed, 108 insertions(+)
create mode 100644 arch/arm/boot/dts/tegra20-peripherals-opp.dtsi

diff --git a/arch/arm/boot/dts/tegra20-acer-a500-picasso.dts b/arch/arm/boot/dts/tegra20-acer-a500-picasso.dts
index dd6fb134ee39..a29b44837855 100644
--- a/arch/arm/boot/dts/tegra20-acer-a500-picasso.dts
+++ b/arch/arm/boot/dts/tegra20-acer-a500-picasso.dts
@@ -1451,3 +1451,8 @@ emc-table@300000 {
};
};
};
+
+&emc_icc_dvfs_opp_table {
+ /delete-node/ opp@666000000;
+ /delete-node/ opp@760000000;
+};
diff --git a/arch/arm/boot/dts/tegra20-colibri.dtsi b/arch/arm/boot/dts/tegra20-colibri.dtsi
index 6162d193e12c..585a5b441cf6 100644
--- a/arch/arm/boot/dts/tegra20-colibri.dtsi
+++ b/arch/arm/boot/dts/tegra20-colibri.dtsi
@@ -742,6 +742,10 @@ sound {
};
};

+&emc_icc_dvfs_opp_table {
+ /delete-node/ opp@760000000;
+};
+
&gpio {
lan-reset-n {
gpio-hog;
diff --git a/arch/arm/boot/dts/tegra20-paz00.dts b/arch/arm/boot/dts/tegra20-paz00.dts
index ada2bed8b1b5..7e49112cd9a1 100644
--- a/arch/arm/boot/dts/tegra20-paz00.dts
+++ b/arch/arm/boot/dts/tegra20-paz00.dts
@@ -662,3 +662,7 @@ cpu@1 {
};
};
};
+
+&emc_icc_dvfs_opp_table {
+ /delete-node/ opp@760000000;
+};
diff --git a/arch/arm/boot/dts/tegra20-peripherals-opp.dtsi b/arch/arm/boot/dts/tegra20-peripherals-opp.dtsi
new file mode 100644
index 000000000000..25b1ba73951e
--- /dev/null
+++ b/arch/arm/boot/dts/tegra20-peripherals-opp.dtsi
@@ -0,0 +1,92 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/ {
+ emc_icc_dvfs_opp_table: emc-dvfs-opp-table {
+ compatible = "operating-points-v2";
+
+ opp@36000000 {
+ opp-microvolt = <950000 950000 1300000>;
+ opp-hz = /bits/ 64 <36000000>;
+ };
+
+ opp@47500000 {
+ opp-microvolt = <950000 950000 1300000>;
+ opp-hz = /bits/ 64 <47500000>;
+ };
+
+ opp@50000000 {
+ opp-microvolt = <950000 950000 1300000>;
+ opp-hz = /bits/ 64 <50000000>;
+ };
+
+ opp@54000000 {
+ opp-microvolt = <950000 950000 1300000>;
+ opp-hz = /bits/ 64 <54000000>;
+ };
+
+ opp@57000000 {
+ opp-microvolt = <950000 950000 1300000>;
+ opp-hz = /bits/ 64 <57000000>;
+ };
+
+ opp@100000000 {
+ opp-microvolt = <1000000 1000000 1300000>;
+ opp-hz = /bits/ 64 <100000000>;
+ };
+
+ opp@108000000 {
+ opp-microvolt = <1000000 1000000 1300000>;
+ opp-hz = /bits/ 64 <108000000>;
+ };
+
+ opp@126666000 {
+ opp-microvolt = <1000000 1000000 1300000>;
+ opp-hz = /bits/ 64 <126666000>;
+ };
+
+ opp@150000000 {
+ opp-microvolt = <1000000 1000000 1300000>;
+ opp-hz = /bits/ 64 <150000000>;
+ };
+
+ opp@190000000 {
+ opp-microvolt = <1000000 1000000 1300000>;
+ opp-hz = /bits/ 64 <190000000>;
+ };
+
+ opp@216000000 {
+ opp-microvolt = <1000000 1000000 1300000>;
+ opp-hz = /bits/ 64 <216000000>;
+ };
+
+ opp@300000000 {
+ opp-microvolt = <1000000 1000000 1300000>;
+ opp-hz = /bits/ 64 <300000000>;
+ };
+
+ opp@333000000 {
+ opp-microvolt = <1000000 1000000 1300000>;
+ opp-hz = /bits/ 64 <333000000>;
+ };
+
+ opp@380000000 {
+ opp-microvolt = <1100000 1100000 1300000>;
+ opp-hz = /bits/ 64 <380000000>;
+ };
+
+ opp@600000000 {
+ opp-microvolt = <1200000 1200000 1300000>;
+ opp-hz = /bits/ 64 <600000000>;
+ };
+
+ opp@666000000 {
+ opp-microvolt = <1200000 1200000 1300000>;
+ opp-hz = /bits/ 64 <666000000>;
+ };
+
+ opp@760000000 {
+ opp-microvolt = <1300000 1300000 1300000>;
+ opp-hz = /bits/ 64 <760000000>;
+ };
+ };
+};
diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index 8f8ad81916e7..6ce498178105 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -6,6 +6,8 @@
#include <dt-bindings/interrupt-controller/arm-gic.h>
#include <dt-bindings/soc/tegra-pmc.h>

+#include "tegra20-peripherals-opp.dtsi"
+
/ {
compatible = "nvidia,tegra20";
interrupt-parent = <&lic>;
@@ -664,6 +666,7 @@ emc: memory-controller@7000f400 {
#size-cells = <0>;
#interconnect-cells = <0>;

+ operating-points-v2 = <&emc_icc_dvfs_opp_table>;
nvidia,memory-controller = <&mc>;
};

--
2.29.2

2020-11-15 21:39:10

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v9 13/17] ARM: tegra: Add interconnect properties to Tegra124 device-tree

Add interconnect properties to the Memory Controller, External Memory
Controller and the Display Controller nodes in order to describe hardware
interconnection.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
arch/arm/boot/dts/tegra124.dtsi | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
index 64f488ba1e72..1801e30b1d3a 100644
--- a/arch/arm/boot/dts/tegra124.dtsi
+++ b/arch/arm/boot/dts/tegra124.dtsi
@@ -113,6 +113,19 @@ dc@54200000 {
iommus = <&mc TEGRA_SWGROUP_DC>;

nvidia,head = <0>;
+
+ interconnects = <&mc TEGRA124_MC_DISPLAY0A &emc>,
+ <&mc TEGRA124_MC_DISPLAY0B &emc>,
+ <&mc TEGRA124_MC_DISPLAY0C &emc>,
+ <&mc TEGRA124_MC_DISPLAYHC &emc>,
+ <&mc TEGRA124_MC_DISPLAYD &emc>,
+ <&mc TEGRA124_MC_DISPLAYT &emc>;
+ interconnect-names = "wina",
+ "winb",
+ "winc",
+ "cursor",
+ "wind",
+ "wint";
};

dc@54240000 {
@@ -127,6 +140,15 @@ dc@54240000 {
iommus = <&mc TEGRA_SWGROUP_DCB>;

nvidia,head = <1>;
+
+ interconnects = <&mc TEGRA124_MC_DISPLAY0AB &emc>,
+ <&mc TEGRA124_MC_DISPLAY0BB &emc>,
+ <&mc TEGRA124_MC_DISPLAY0CB &emc>,
+ <&mc TEGRA124_MC_DISPLAYHCB &emc>;
+ interconnect-names = "wina",
+ "winb",
+ "winc",
+ "cursor";
};

hdmi: hdmi@54280000 {
@@ -628,6 +650,7 @@ mc: memory-controller@70019000 {

#iommu-cells = <1>;
#reset-cells = <1>;
+ #interconnect-cells = <1>;
};

emc: external-memory-controller@7001b000 {
@@ -637,6 +660,8 @@ emc: external-memory-controller@7001b000 {
clock-names = "emc";

nvidia,memory-controller = <&mc>;
+
+ #interconnect-cells = <0>;
};

sata@70020000 {
--
2.29.2

2020-11-15 21:39:13

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v9 01/17] memory: tegra30: Support interconnect framework

Now Internal and External memory controllers are memory interconnection
providers. This allows us to use interconnect API for tuning of memory
configuration. EMC driver now supports OPPs and DVFS. MC driver now
supports tuning of memory arbitration latency, which needs to be done
for ISO memory clients, like a Display client for example.

Tested-by: Peter Geis <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/memory/tegra/Kconfig | 1 +
drivers/memory/tegra/tegra30-emc.c | 349 +++++++++++++++++++++++++++--
drivers/memory/tegra/tegra30.c | 173 +++++++++++++-
3 files changed, 501 insertions(+), 22 deletions(-)

diff --git a/drivers/memory/tegra/Kconfig b/drivers/memory/tegra/Kconfig
index 2a4a16bcf91c..ca7077a06f4c 100644
--- a/drivers/memory/tegra/Kconfig
+++ b/drivers/memory/tegra/Kconfig
@@ -24,6 +24,7 @@ config TEGRA30_EMC
tristate "NVIDIA Tegra30 External Memory Controller driver"
default y
depends on TEGRA_MC && ARCH_TEGRA_3x_SOC
+ select PM_OPP
help
This driver is for the External Memory Controller (EMC) found on
Tegra30 chips. The EMC controls the external DRAM on the board.
diff --git a/drivers/memory/tegra/tegra30-emc.c b/drivers/memory/tegra/tegra30-emc.c
index 3488786da03b..1974b067789f 100644
--- a/drivers/memory/tegra/tegra30-emc.c
+++ b/drivers/memory/tegra/tegra30-emc.c
@@ -14,16 +14,21 @@
#include <linux/debugfs.h>
#include <linux/delay.h>
#include <linux/err.h>
+#include <linux/interconnect-provider.h>
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/iopoll.h>
#include <linux/kernel.h>
#include <linux/module.h>
+#include <linux/mutex.h>
#include <linux/of_platform.h>
#include <linux/platform_device.h>
+#include <linux/pm_opp.h>
+#include <linux/slab.h>
#include <linux/sort.h>
#include <linux/types.h>

+#include <soc/tegra/common.h>
#include <soc/tegra/fuse.h>

#include "mc.h"
@@ -323,9 +328,21 @@ struct emc_timing {
bool emc_cfg_dyn_self_ref;
};

+enum emc_rate_request_type {
+ EMC_RATE_DEBUG,
+ EMC_RATE_ICC,
+ EMC_RATE_TYPE_MAX,
+};
+
+struct emc_rate_request {
+ unsigned long min_rate;
+ unsigned long max_rate;
+};
+
struct tegra_emc {
struct device *dev;
struct tegra_mc *mc;
+ struct icc_provider provider;
struct notifier_block clk_nb;
struct clk *clk;
void __iomem *regs;
@@ -352,6 +369,15 @@ struct tegra_emc {
unsigned long min_rate;
unsigned long max_rate;
} debugfs;
+
+ /*
+ * There are multiple sources in the EMC driver which could request
+ * a min/max clock rate, these rates are contained in this array.
+ */
+ struct emc_rate_request requested_rate[EMC_RATE_TYPE_MAX];
+
+ /* protect shared rate-change code path */
+ struct mutex rate_lock;
};

static int emc_seq_update_timing(struct tegra_emc *emc)
@@ -1094,6 +1120,83 @@ static long emc_round_rate(unsigned long rate,
return timing->rate;
}

+static void tegra_emc_rate_requests_init(struct tegra_emc *emc)
+{
+ unsigned int i;
+
+ for (i = 0; i < EMC_RATE_TYPE_MAX; i++) {
+ emc->requested_rate[i].min_rate = 0;
+ emc->requested_rate[i].max_rate = ULONG_MAX;
+ }
+}
+
+static int emc_request_rate(struct tegra_emc *emc,
+ unsigned long new_min_rate,
+ unsigned long new_max_rate,
+ enum emc_rate_request_type type)
+{
+ struct emc_rate_request *req = emc->requested_rate;
+ unsigned long min_rate = 0, max_rate = ULONG_MAX;
+ unsigned int i;
+ int err;
+
+ /* select minimum and maximum rates among the requested rates */
+ for (i = 0; i < EMC_RATE_TYPE_MAX; i++, req++) {
+ if (i == type) {
+ min_rate = max(new_min_rate, min_rate);
+ max_rate = min(new_max_rate, max_rate);
+ } else {
+ min_rate = max(req->min_rate, min_rate);
+ max_rate = min(req->max_rate, max_rate);
+ }
+ }
+
+ if (min_rate > max_rate) {
+ dev_err_ratelimited(emc->dev, "%s: type %u: out of range: %lu %lu\n",
+ __func__, type, min_rate, max_rate);
+ return -ERANGE;
+ }
+
+ /*
+ * EMC rate-changes should go via OPP API because it manages voltage
+ * changes.
+ */
+ err = dev_pm_opp_set_rate(emc->dev, min_rate);
+ if (err)
+ return err;
+
+ emc->requested_rate[type].min_rate = new_min_rate;
+ emc->requested_rate[type].max_rate = new_max_rate;
+
+ return 0;
+}
+
+static int emc_set_min_rate(struct tegra_emc *emc, unsigned long rate,
+ enum emc_rate_request_type type)
+{
+ struct emc_rate_request *req = &emc->requested_rate[type];
+ int ret;
+
+ mutex_lock(&emc->rate_lock);
+ ret = emc_request_rate(emc, rate, req->max_rate, type);
+ mutex_unlock(&emc->rate_lock);
+
+ return ret;
+}
+
+static int emc_set_max_rate(struct tegra_emc *emc, unsigned long rate,
+ enum emc_rate_request_type type)
+{
+ struct emc_rate_request *req = &emc->requested_rate[type];
+ int ret;
+
+ mutex_lock(&emc->rate_lock);
+ ret = emc_request_rate(emc, req->min_rate, rate, type);
+ mutex_unlock(&emc->rate_lock);
+
+ return ret;
+}
+
/*
* debugfs interface
*
@@ -1177,7 +1280,7 @@ static int tegra_emc_debug_min_rate_set(void *data, u64 rate)
if (!tegra_emc_validate_rate(emc, rate))
return -EINVAL;

- err = clk_set_min_rate(emc->clk, rate);
+ err = emc_set_min_rate(emc, rate, EMC_RATE_DEBUG);
if (err < 0)
return err;

@@ -1207,7 +1310,7 @@ static int tegra_emc_debug_max_rate_set(void *data, u64 rate)
if (!tegra_emc_validate_rate(emc, rate))
return -EINVAL;

- err = clk_set_max_rate(emc->clk, rate);
+ err = emc_set_max_rate(emc, rate, EMC_RATE_DEBUG);
if (err < 0)
return err;

@@ -1264,6 +1367,219 @@ static void tegra_emc_debugfs_init(struct tegra_emc *emc)
emc, &tegra_emc_debug_max_rate_fops);
}

+static inline struct tegra_emc *
+to_tegra_emc_provider(struct icc_provider *provider)
+{
+ return container_of(provider, struct tegra_emc, provider);
+}
+
+static struct icc_node_data *
+emc_of_icc_xlate_extended(struct of_phandle_args *spec, void *data)
+{
+ struct icc_provider *provider = data;
+ struct icc_node_data *ndata;
+ struct icc_node *node;
+
+ /* External Memory is the only possible ICC route */
+ list_for_each_entry(node, &provider->nodes, node_list) {
+ if (node->id != TEGRA_ICC_EMEM)
+ continue;
+
+ ndata = kzalloc(sizeof(*ndata), GFP_KERNEL);
+ if (!ndata)
+ return ERR_PTR(-ENOMEM);
+
+ /*
+ * SRC and DST nodes should have matching TAG in order to have
+ * it set by default for a requested path.
+ */
+ ndata->tag = TEGRA_MC_ICC_TAG_ISO;
+ ndata->node = node;
+
+ return ndata;
+ }
+
+ return ERR_PTR(-EPROBE_DEFER);
+}
+
+static int emc_icc_set(struct icc_node *src, struct icc_node *dst)
+{
+ struct tegra_emc *emc = to_tegra_emc_provider(dst->provider);
+ unsigned long long peak_bw = icc_units_to_bps(dst->peak_bw);
+ unsigned long long avg_bw = icc_units_to_bps(dst->avg_bw);
+ unsigned long long rate = max(avg_bw, peak_bw);
+ const unsigned int dram_data_bus_width_bytes = 4;
+ const unsigned int ddr = 2;
+ int err;
+
+ /*
+ * Tegra30 EMC runs on a clock rate of SDRAM bus. This means that
+ * EMC clock rate is twice smaller than the peak data rate because
+ * data is sampled on both EMC clock edges.
+ */
+ do_div(rate, ddr * dram_data_bus_width_bytes);
+ rate = min_t(u64, rate, U32_MAX);
+
+ err = emc_set_min_rate(emc, rate, EMC_RATE_ICC);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+static int tegra_emc_interconnect_init(struct tegra_emc *emc)
+{
+ const struct tegra_mc_soc *soc = emc->mc->soc;
+ struct icc_node *node;
+ int err;
+
+ emc->provider.dev = emc->dev;
+ emc->provider.set = emc_icc_set;
+ emc->provider.data = &emc->provider;
+ emc->provider.aggregate = soc->icc_ops->aggregate;
+ emc->provider.xlate_extended = emc_of_icc_xlate_extended;
+
+ err = icc_provider_add(&emc->provider);
+ if (err)
+ goto err_msg;
+
+ /* create External Memory Controller node */
+ node = icc_node_create(TEGRA_ICC_EMC);
+ if (IS_ERR(node)) {
+ err = PTR_ERR(node);
+ goto del_provider;
+ }
+
+ node->name = "External Memory Controller";
+ icc_node_add(node, &emc->provider);
+
+ /* link External Memory Controller to External Memory (DRAM) */
+ err = icc_link_create(node, TEGRA_ICC_EMEM);
+ if (err)
+ goto remove_nodes;
+
+ /* create External Memory node */
+ node = icc_node_create(TEGRA_ICC_EMEM);
+ if (IS_ERR(node)) {
+ err = PTR_ERR(node);
+ goto remove_nodes;
+ }
+
+ node->name = "External Memory (DRAM)";
+ icc_node_add(node, &emc->provider);
+
+ return 0;
+
+remove_nodes:
+ icc_nodes_remove(&emc->provider);
+del_provider:
+ icc_provider_del(&emc->provider);
+err_msg:
+ dev_err(emc->dev, "failed to initialize ICC: %d\n", err);
+
+ return err;
+}
+
+static int tegra_emc_opp_table_init(struct tegra_emc *emc)
+{
+ u32 hw_version = BIT(tegra_sku_info.soc_speedo_id);
+ struct opp_table *clk_opp_table, *hw_opp_table;
+ int err;
+
+ clk_opp_table = dev_pm_opp_set_clkname(emc->dev, NULL);
+ err = PTR_ERR_OR_ZERO(clk_opp_table);
+ if (err) {
+ dev_err(emc->dev, "failed to set OPP clk: %d\n", err);
+ return err;
+ }
+
+ hw_opp_table = dev_pm_opp_set_supported_hw(emc->dev, &hw_version, 1);
+ err = PTR_ERR_OR_ZERO(hw_opp_table);
+ if (err) {
+ dev_err(emc->dev, "failed to set OPP supported HW: %d\n", err);
+ goto put_clk_table;
+ }
+
+ err = dev_pm_opp_of_add_table(emc->dev);
+ if (err) {
+ /*
+ * -ENODEV means that this is an older device-tree which
+ * doesn't have OPP table and EMC driver isn't very useful
+ * in this case.
+ */
+ if (err == -ENODEV)
+ dev_err(emc->dev, "OPP table not found, please update your device tree\n");
+ else
+ dev_err(emc->dev, "failed to add OPP table: %d\n", err);
+
+ goto put_hw_table;
+ }
+
+ dev_info(emc->dev, "OPP HW ver. 0x%x, current clock rate %lu MHz\n",
+ hw_version, clk_get_rate(emc->clk) / 1000000);
+
+ /* first dummy rate-set initializes voltage state */
+ err = dev_pm_opp_set_rate(emc->dev, clk_get_rate(emc->clk));
+ if (err) {
+ dev_err(emc->dev, "failed to initialize OPP clock: %d\n", err);
+ goto remove_table;
+ }
+
+ return 0;
+
+remove_table:
+ dev_pm_opp_of_remove_table(emc->dev);
+put_hw_table:
+ dev_pm_opp_put_supported_hw(hw_opp_table);
+put_clk_table:
+ dev_pm_opp_put_clkname(clk_opp_table);
+
+ return err;
+}
+
+static void devm_tegra_emc_unset_callback(void *data)
+{
+ tegra20_clk_set_emc_round_callback(NULL, NULL);
+}
+
+static void devm_tegra_emc_unreg_clk_notifier(void *data)
+{
+ struct tegra_emc *emc = data;
+
+ clk_notifier_unregister(emc->clk, &emc->clk_nb);
+}
+
+static int tegra_emc_init_clk(struct tegra_emc *emc)
+{
+ int err;
+
+ tegra20_clk_set_emc_round_callback(emc_round_rate, emc);
+
+ err = devm_add_action_or_reset(emc->dev, devm_tegra_emc_unset_callback,
+ NULL);
+ if (err)
+ return err;
+
+ emc->clk = devm_clk_get(emc->dev, NULL);
+ if (IS_ERR(emc->clk)) {
+ dev_err(emc->dev, "failed to get EMC clock: %pe\n", emc->clk);
+ return PTR_ERR(emc->clk);
+ }
+
+ err = clk_notifier_register(emc->clk, &emc->clk_nb);
+ if (err) {
+ dev_err(emc->dev, "failed to register clk notifier: %d\n", err);
+ return err;
+ }
+
+ err = devm_add_action_or_reset(emc->dev,
+ devm_tegra_emc_unreg_clk_notifier, emc);
+ if (err)
+ return err;
+
+ return 0;
+}
+
static int tegra_emc_probe(struct platform_device *pdev)
{
struct device_node *np;
@@ -1280,6 +1596,7 @@ static int tegra_emc_probe(struct platform_device *pdev)
if (IS_ERR(emc->mc))
return PTR_ERR(emc->mc);

+ mutex_init(&emc->rate_lock);
emc->clk_nb.notifier_call = emc_clk_change_notify;
emc->dev = &pdev->dev;

@@ -1312,24 +1629,18 @@ static int tegra_emc_probe(struct platform_device *pdev)
return err;
}

- tegra20_clk_set_emc_round_callback(emc_round_rate, emc);
-
- emc->clk = devm_clk_get(&pdev->dev, "emc");
- if (IS_ERR(emc->clk)) {
- err = PTR_ERR(emc->clk);
- dev_err(&pdev->dev, "failed to get emc clock: %d\n", err);
- goto unset_cb;
- }
+ err = tegra_emc_init_clk(emc);
+ if (err)
+ return err;

- err = clk_notifier_register(emc->clk, &emc->clk_nb);
- if (err) {
- dev_err(&pdev->dev, "failed to register clk notifier: %d\n",
- err);
- goto unset_cb;
- }
+ err = tegra_emc_opp_table_init(emc);
+ if (err)
+ return err;

platform_set_drvdata(pdev, emc);
+ tegra_emc_rate_requests_init(emc);
tegra_emc_debugfs_init(emc);
+ tegra_emc_interconnect_init(emc);

/*
* Don't allow the kernel module to be unloaded. Unloading adds some
@@ -1339,11 +1650,6 @@ static int tegra_emc_probe(struct platform_device *pdev)
try_module_get(THIS_MODULE);

return 0;
-
-unset_cb:
- tegra20_clk_set_emc_round_callback(NULL, NULL);
-
- return err;
}

static int tegra_emc_suspend(struct device *dev)
@@ -1397,6 +1703,7 @@ static struct platform_driver tegra_emc_driver = {
.of_match_table = tegra_emc_of_match,
.pm = &tegra_emc_pm_ops,
.suppress_bind_attrs = true,
+ .sync_state = icc_sync_state,
},
};
module_platform_driver(tegra_emc_driver);
diff --git a/drivers/memory/tegra/tegra30.c b/drivers/memory/tegra/tegra30.c
index d0314f29608d..ea849003014b 100644
--- a/drivers/memory/tegra/tegra30.c
+++ b/drivers/memory/tegra/tegra30.c
@@ -4,7 +4,8 @@
*/

#include <linux/of.h>
-#include <linux/mm.h>
+#include <linux/of_device.h>
+#include <linux/slab.h>

#include <dt-bindings/memory/tegra30-mc.h>

@@ -1083,6 +1084,175 @@ static const struct tegra_mc_reset tegra30_mc_resets[] = {
TEGRA30_MC_RESET(VI, 0x200, 0x204, 17),
};

+static void tegra30_mc_tune_client_latency(struct tegra_mc *mc,
+ const struct tegra_mc_client *client,
+ unsigned int bandwidth_mbytes_sec)
+{
+ u32 arb_tolerance_compensation_nsec, arb_tolerance_compensation_div;
+ const struct tegra_mc_la *la = &client->la;
+ unsigned int fifo_size = client->fifo_size;
+ u32 arb_nsec, la_ticks, value;
+
+ /* see 18.4.1 Client Configuration in Tegra3 TRM v03p */
+ if (bandwidth_mbytes_sec)
+ arb_nsec = fifo_size * NSEC_PER_USEC / bandwidth_mbytes_sec;
+ else
+ arb_nsec = U32_MAX;
+
+ /*
+ * Latency allowness should be set with consideration for the module's
+ * latency tolerance and internal buffering capabilities.
+ *
+ * Display memory clients use isochronous transfers and have very low
+ * tolerance to a belated transfers. Hence we need to compensate the
+ * memory arbitration imperfection for them in order to prevent FIFO
+ * underflow condition when memory bus is busy.
+ *
+ * VI clients also need a stronger compensation.
+ */
+ switch (client->swgroup) {
+ case TEGRA_SWGROUP_MPCORE:
+ case TEGRA_SWGROUP_PTC:
+ /*
+ * We always want lower latency for these clients, hence
+ * don't touch them.
+ */
+ return;
+
+ case TEGRA_SWGROUP_DC:
+ case TEGRA_SWGROUP_DCB:
+ arb_tolerance_compensation_nsec = 1050;
+ arb_tolerance_compensation_div = 2;
+ break;
+
+ case TEGRA_SWGROUP_VI:
+ arb_tolerance_compensation_nsec = 1050;
+ arb_tolerance_compensation_div = 1;
+ break;
+
+ default:
+ arb_tolerance_compensation_nsec = 150;
+ arb_tolerance_compensation_div = 1;
+ break;
+ }
+
+ if (arb_nsec > arb_tolerance_compensation_nsec)
+ arb_nsec -= arb_tolerance_compensation_nsec;
+ else
+ arb_nsec = 0;
+
+ arb_nsec /= arb_tolerance_compensation_div;
+
+ /*
+ * Latency allowance is a number of ticks a request from a particular
+ * client may wait in the EMEM arbiter before it becomes a high-priority
+ * request.
+ */
+ la_ticks = arb_nsec / mc->tick;
+ la_ticks = min(la_ticks, la->mask);
+
+ value = mc_readl(mc, la->reg);
+ value &= ~(la->mask << la->shift);
+ value |= la_ticks << la->shift;
+ mc_writel(mc, value, la->reg);
+}
+
+static int tegra30_mc_icc_set(struct icc_node *src, struct icc_node *dst)
+{
+ struct tegra_mc *mc = icc_provider_to_tegra_mc(src->provider);
+ const struct tegra_mc_client *client = &mc->soc->clients[src->id];
+ u64 peak_bandwidth = icc_units_to_bps(src->peak_bw);
+
+ /*
+ * Skip pre-initialization that is done by icc_node_add(), which sets
+ * bandwidth to maximum for all clients before drivers are loaded.
+ *
+ * This doesn't make sense for us because we don't have drivers for all
+ * clients and it's okay to keep configuration left from bootloader
+ * during boot, at least for today.
+ */
+ if (src == dst)
+ return 0;
+
+ /* convert bytes/sec to megabytes/sec */
+ do_div(peak_bandwidth, 1000000);
+
+ tegra30_mc_tune_client_latency(mc, client, peak_bandwidth);
+
+ return 0;
+}
+
+static int tegra30_mc_icc_aggreate(struct icc_node *node, u32 tag, u32 avg_bw,
+ u32 peak_bw, u32 *agg_avg, u32 *agg_peak)
+{
+ /*
+ * ISO clients need to reserve extra bandwidth up-front because
+ * there could be high bandwidth pressure during initial filling
+ * of the client's FIFO buffers. Secondly, we need to take into
+ * account impurities of the memory subsystem.
+ */
+ if (tag & TEGRA_MC_ICC_TAG_ISO)
+ peak_bw = tegra_mc_scale_percents(peak_bw, 400);
+
+ *agg_avg += avg_bw;
+ *agg_peak = max(*agg_peak, peak_bw);
+
+ return 0;
+}
+
+static struct icc_node_data *
+tegra30_mc_of_icc_xlate_extended(struct of_phandle_args *spec, void *data)
+{
+ struct tegra_mc *mc = icc_provider_to_tegra_mc(data);
+ const struct tegra_mc_client *client;
+ unsigned int i, idx = spec->args[0];
+ struct icc_node_data *ndata;
+ struct icc_node *node;
+
+ list_for_each_entry(node, &mc->provider.nodes, node_list) {
+ if (node->id != idx)
+ continue;
+
+ ndata = kzalloc(sizeof(*ndata), GFP_KERNEL);
+ if (!ndata)
+ return ERR_PTR(-ENOMEM);
+
+ client = &mc->soc->clients[idx];
+ ndata->node = node;
+
+ switch (client->swgroup) {
+ case TEGRA_SWGROUP_DC:
+ case TEGRA_SWGROUP_DCB:
+ case TEGRA_SWGROUP_PTC:
+ case TEGRA_SWGROUP_VI:
+ /* these clients are isochronous by default */
+ ndata->tag = TEGRA_MC_ICC_TAG_ISO;
+ break;
+
+ default:
+ ndata->tag = TEGRA_MC_ICC_TAG_DEFAULT;
+ break;
+ }
+
+ return ndata;
+ }
+
+ for (i = 0; i < mc->soc->num_clients; i++) {
+ if (mc->soc->clients[i].id == idx)
+ return ERR_PTR(-EPROBE_DEFER);
+ }
+
+ dev_err(mc->dev, "invalid ICC client ID %u\n", idx);
+
+ return ERR_PTR(-EINVAL);
+}
+
+static const struct tegra_mc_icc_ops tegra30_mc_icc_ops = {
+ .xlate_extended = tegra30_mc_of_icc_xlate_extended,
+ .aggregate = tegra30_mc_icc_aggreate,
+ .set = tegra30_mc_icc_set,
+};
+
const struct tegra_mc_soc tegra30_mc_soc = {
.clients = tegra30_mc_clients,
.num_clients = ARRAY_SIZE(tegra30_mc_clients),
@@ -1097,4 +1267,5 @@ const struct tegra_mc_soc tegra30_mc_soc = {
.reset_ops = &tegra_mc_reset_ops_common,
.resets = tegra30_mc_resets,
.num_resets = ARRAY_SIZE(tegra30_mc_resets),
+ .icc_ops = &tegra30_mc_icc_ops,
};
--
2.29.2

2020-11-15 21:40:56

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v9 07/17] PM / devfreq: tegra30: Support interconnect and OPPs from device-tree

This patch moves ACTMON driver away from generating OPP table by itself,
transitioning it to use the table which comes from device-tree. This
change breaks compatibility with older device-trees in order to bring
support for the interconnect framework to the driver. This is a mandatory
change which needs to be done in order to implement interconnect-based
memory DVFS. Users of legacy device-trees will get a message telling that
theirs DT needs to be upgraded. Now ACTMON issues memory bandwidth request
using dev_pm_opp_set_bw(), instead of driving EMC clock rate directly.

Tested-by: Peter Geis <[email protected]>
Tested-by: Nicolas Chauvet <[email protected]>
Acked-by: Chanwoo Choi <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/devfreq/tegra30-devfreq.c | 86 ++++++++++++++++---------------
1 file changed, 44 insertions(+), 42 deletions(-)

diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c
index 38cc0d014738..ed6d4469c8c7 100644
--- a/drivers/devfreq/tegra30-devfreq.c
+++ b/drivers/devfreq/tegra30-devfreq.c
@@ -19,6 +19,8 @@
#include <linux/reset.h>
#include <linux/workqueue.h>

+#include <soc/tegra/fuse.h>
+
#include "governor.h"

#define ACTMON_GLB_STATUS 0x0
@@ -155,6 +157,7 @@ struct tegra_devfreq_device {

struct tegra_devfreq {
struct devfreq *devfreq;
+ struct opp_table *opp_table;

struct reset_control *reset;
struct clk *clock;
@@ -612,34 +615,19 @@ static void tegra_actmon_stop(struct tegra_devfreq *tegra)
static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
u32 flags)
{
- struct tegra_devfreq *tegra = dev_get_drvdata(dev);
- struct devfreq *devfreq = tegra->devfreq;
struct dev_pm_opp *opp;
- unsigned long rate;
- int err;
+ int ret;

opp = devfreq_recommended_opp(dev, freq, flags);
if (IS_ERR(opp)) {
dev_err(dev, "Failed to find opp for %lu Hz\n", *freq);
return PTR_ERR(opp);
}
- rate = dev_pm_opp_get_freq(opp);
- dev_pm_opp_put(opp);
-
- err = clk_set_min_rate(tegra->emc_clock, rate * KHZ);
- if (err)
- return err;
-
- err = clk_set_rate(tegra->emc_clock, 0);
- if (err)
- goto restore_min_rate;

- return 0;
-
-restore_min_rate:
- clk_set_min_rate(tegra->emc_clock, devfreq->previous_freq);
+ ret = dev_pm_opp_set_bw(dev, opp);
+ dev_pm_opp_put(opp);

- return err;
+ return ret;
}

static int tegra_devfreq_get_dev_status(struct device *dev,
@@ -655,7 +643,7 @@ static int tegra_devfreq_get_dev_status(struct device *dev,
stat->private_data = tegra;

/* The below are to be used by the other governors */
- stat->current_frequency = cur_freq;
+ stat->current_frequency = cur_freq * KHZ;

actmon_dev = &tegra->devices[MCALL];

@@ -705,7 +693,12 @@ static int tegra_governor_get_target(struct devfreq *devfreq,
target_freq = max(target_freq, dev->target_freq);
}

- *freq = target_freq;
+ /*
+ * tegra-devfreq driver operates with KHz units, while OPP table
+ * entries use Hz units. Hence we need to convert the units for the
+ * devfreq core.
+ */
+ *freq = target_freq * KHZ;

return 0;
}
@@ -774,6 +767,7 @@ static struct devfreq_governor tegra_devfreq_governor = {

static int tegra_devfreq_probe(struct platform_device *pdev)
{
+ u32 hw_version = BIT(tegra_sku_info.soc_speedo_id);
struct tegra_devfreq_device *dev;
struct tegra_devfreq *tegra;
struct devfreq *devfreq;
@@ -781,6 +775,13 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
long rate;
int err;

+ /* legacy device-trees don't have OPP table and must be updated */
+ if (!device_property_present(&pdev->dev, "operating-points-v2")) {
+ dev_err(&pdev->dev,
+ "OPP table not found, please update your device tree\n");
+ return -ENODEV;
+ }
+
tegra = devm_kzalloc(&pdev->dev, sizeof(*tegra), GFP_KERNEL);
if (!tegra)
return -ENOMEM;
@@ -822,11 +823,25 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
return err;
}

+ tegra->opp_table = dev_pm_opp_set_supported_hw(&pdev->dev,
+ &hw_version, 1);
+ err = PTR_ERR_OR_ZERO(tegra->opp_table);
+ if (err) {
+ dev_err(&pdev->dev, "Failed to set supported HW: %d\n", err);
+ return err;
+ }
+
+ err = dev_pm_opp_of_add_table(&pdev->dev);
+ if (err) {
+ dev_err(&pdev->dev, "Failed to add OPP table: %d\n", err);
+ goto put_hw;
+ }
+
err = clk_prepare_enable(tegra->clock);
if (err) {
dev_err(&pdev->dev,
"Failed to prepare and enable ACTMON clock\n");
- return err;
+ goto remove_table;
}

err = reset_control_reset(tegra->reset);
@@ -850,23 +865,6 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
dev->regs = tegra->regs + dev->config->offset;
}

- for (rate = 0; rate <= tegra->max_freq * KHZ; rate++) {
- rate = clk_round_rate(tegra->emc_clock, rate);
-
- if (rate < 0) {
- dev_err(&pdev->dev,
- "Failed to round clock rate: %ld\n", rate);
- err = rate;
- goto remove_opps;
- }
-
- err = dev_pm_opp_add(&pdev->dev, rate / KHZ, 0);
- if (err) {
- dev_err(&pdev->dev, "Failed to add OPP: %d\n", err);
- goto remove_opps;
- }
- }
-
platform_set_drvdata(pdev, tegra);

tegra->clk_rate_change_nb.notifier_call = tegra_actmon_clk_notify_cb;
@@ -882,7 +880,6 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
}

tegra_devfreq_profile.initial_freq = clk_get_rate(tegra->emc_clock);
- tegra_devfreq_profile.initial_freq /= KHZ;

devfreq = devfreq_add_device(&pdev->dev, &tegra_devfreq_profile,
"tegra_actmon", NULL);
@@ -902,6 +899,10 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
reset_control_reset(tegra->reset);
disable_clk:
clk_disable_unprepare(tegra->clock);
+remove_table:
+ dev_pm_opp_of_remove_table(&pdev->dev);
+put_hw:
+ dev_pm_opp_put_supported_hw(tegra->opp_table);

return err;
}
@@ -913,11 +914,12 @@ static int tegra_devfreq_remove(struct platform_device *pdev)
devfreq_remove_device(tegra->devfreq);
devfreq_remove_governor(&tegra_devfreq_governor);

- dev_pm_opp_remove_all_dynamic(&pdev->dev);
-
reset_control_reset(tegra->reset);
clk_disable_unprepare(tegra->clock);

+ dev_pm_opp_of_remove_table(&pdev->dev);
+ dev_pm_opp_put_supported_hw(tegra->opp_table);
+
return 0;
}

--
2.29.2

2020-11-15 21:41:45

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v9 05/17] drm/tegra: dc: Support memory bandwidth management

Display controller (DC) performs isochronous memory transfers, and thus,
has a requirement for a minimum memory bandwidth that shall be fulfilled,
otherwise framebuffer data can't be fetched fast enough and this results
in a DC's data-FIFO underflow that follows by a visual corruption.

The Memory Controller drivers provide facility for memory bandwidth
management via interconnect API. Let's wire up the interconnect API
support to the DC driver in order to fix the distorted display output
on T30 Ouya, T124 TK1 and other Tegra devices.

Tested-by: Peter Geis <[email protected]>
Tested-by: Nicolas Chauvet <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/tegra/Kconfig | 1 +
drivers/gpu/drm/tegra/dc.c | 349 ++++++++++++++++++++++++++++++++++
drivers/gpu/drm/tegra/dc.h | 14 ++
drivers/gpu/drm/tegra/drm.c | 14 ++
drivers/gpu/drm/tegra/hub.c | 3 +
drivers/gpu/drm/tegra/plane.c | 121 ++++++++++++
drivers/gpu/drm/tegra/plane.h | 15 ++
7 files changed, 517 insertions(+)

diff --git a/drivers/gpu/drm/tegra/Kconfig b/drivers/gpu/drm/tegra/Kconfig
index 5043dcaf1cf9..1650a448eabd 100644
--- a/drivers/gpu/drm/tegra/Kconfig
+++ b/drivers/gpu/drm/tegra/Kconfig
@@ -9,6 +9,7 @@ config DRM_TEGRA
select DRM_MIPI_DSI
select DRM_PANEL
select TEGRA_HOST1X
+ select INTERCONNECT
select IOMMU_IOVA
select CEC_CORE if CEC_NOTIFIER
help
diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 85dd7131553a..5c587cfd1bb2 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -8,6 +8,7 @@
#include <linux/debugfs.h>
#include <linux/delay.h>
#include <linux/iommu.h>
+#include <linux/interconnect.h>
#include <linux/module.h>
#include <linux/of_device.h>
#include <linux/pm_runtime.h>
@@ -616,6 +617,9 @@ static int tegra_plane_atomic_check(struct drm_plane *plane,
struct tegra_dc *dc = to_tegra_dc(state->crtc);
int err;

+ plane_state->peak_memory_bandwidth = 0;
+ plane_state->avg_memory_bandwidth = 0;
+
/* no need for further checks if the plane is being disabled */
if (!state->crtc)
return 0;
@@ -802,6 +806,12 @@ static struct drm_plane *tegra_primary_plane_create(struct drm_device *drm,
formats = dc->soc->primary_formats;
modifiers = dc->soc->modifiers;

+ err = tegra_plane_interconnect_init(plane);
+ if (err) {
+ kfree(plane);
+ return ERR_PTR(err);
+ }
+
err = drm_universal_plane_init(drm, &plane->base, possible_crtcs,
&tegra_plane_funcs, formats,
num_formats, modifiers, type, NULL);
@@ -833,9 +843,13 @@ static const u32 tegra_cursor_plane_formats[] = {
static int tegra_cursor_atomic_check(struct drm_plane *plane,
struct drm_plane_state *state)
{
+ struct tegra_plane_state *plane_state = to_tegra_plane_state(state);
struct tegra_plane *tegra = to_tegra_plane(plane);
int err;

+ plane_state->peak_memory_bandwidth = 0;
+ plane_state->avg_memory_bandwidth = 0;
+
/* no need for further checks if the plane is being disabled */
if (!state->crtc)
return 0;
@@ -973,6 +987,12 @@ static struct drm_plane *tegra_dc_cursor_plane_create(struct drm_device *drm,
num_formats = ARRAY_SIZE(tegra_cursor_plane_formats);
formats = tegra_cursor_plane_formats;

+ err = tegra_plane_interconnect_init(plane);
+ if (err) {
+ kfree(plane);
+ return ERR_PTR(err);
+ }
+
err = drm_universal_plane_init(drm, &plane->base, possible_crtcs,
&tegra_plane_funcs, formats,
num_formats, NULL,
@@ -1087,6 +1107,12 @@ static struct drm_plane *tegra_dc_overlay_plane_create(struct drm_device *drm,
num_formats = dc->soc->num_overlay_formats;
formats = dc->soc->overlay_formats;

+ err = tegra_plane_interconnect_init(plane);
+ if (err) {
+ kfree(plane);
+ return ERR_PTR(err);
+ }
+
if (!cursor)
type = DRM_PLANE_TYPE_OVERLAY;
else
@@ -1204,6 +1230,7 @@ tegra_crtc_atomic_duplicate_state(struct drm_crtc *crtc)
{
struct tegra_dc_state *state = to_dc_state(crtc->state);
struct tegra_dc_state *copy;
+ unsigned int i;

copy = kmalloc(sizeof(*copy), GFP_KERNEL);
if (!copy)
@@ -1215,6 +1242,9 @@ tegra_crtc_atomic_duplicate_state(struct drm_crtc *crtc)
copy->div = state->div;
copy->planes = state->planes;

+ for (i = 0; i < ARRAY_SIZE(state->plane_peak_bw); i++)
+ copy->plane_peak_bw[i] = state->plane_peak_bw[i];
+
return &copy->base;
}

@@ -1741,6 +1771,106 @@ static int tegra_dc_wait_idle(struct tegra_dc *dc, unsigned long timeout)
return -ETIMEDOUT;
}

+static void
+tegra_crtc_update_memory_bandwidth(struct drm_crtc *crtc,
+ struct drm_atomic_state *state,
+ bool prepare_bandwidth_transition)
+{
+ const struct tegra_plane_state *old_tegra_state, *new_tegra_state;
+ const struct tegra_dc_state *old_dc_state, *new_dc_state;
+ u32 i, new_avg_bw, old_avg_bw, new_peak_bw, old_peak_bw;
+ const struct drm_plane_state *old_plane_state;
+ const struct drm_crtc_state *old_crtc_state;
+ struct tegra_dc_window window, old_window;
+ struct tegra_dc *dc = to_tegra_dc(crtc);
+ struct tegra_plane *tegra;
+ struct drm_plane *plane;
+
+ if (dc->soc->has_nvdisplay)
+ return;
+
+ old_crtc_state = drm_atomic_get_old_crtc_state(state, crtc);
+ old_dc_state = to_const_dc_state(old_crtc_state);
+ new_dc_state = to_const_dc_state(crtc->state);
+
+ if (!crtc->state->active) {
+ if (!old_crtc_state->active)
+ return;
+
+ /*
+ * When CRTC is disabled on DPMS, the state of attached planes
+ * is kept unchanged. Hence we need to enforce removal of the
+ * bandwidths from the ICC paths.
+ */
+ drm_atomic_crtc_for_each_plane(plane, crtc) {
+ tegra = to_tegra_plane(plane);
+
+ icc_set_bw(tegra->icc_mem, 0, 0);
+ icc_set_bw(tegra->icc_mem_vfilter, 0, 0);
+ }
+
+ return;
+ }
+
+ for_each_old_plane_in_state(old_crtc_state->state, plane,
+ old_plane_state, i) {
+ old_tegra_state = to_const_tegra_plane_state(old_plane_state);
+ new_tegra_state = to_const_tegra_plane_state(plane->state);
+ tegra = to_tegra_plane(plane);
+
+ /*
+ * We're iterating over the global atomic state and it contains
+ * planes from another CRTC, hence we need to filter out the
+ * planes unrelated to this CRTC.
+ */
+ if (tegra->dc != dc)
+ continue;
+
+ new_avg_bw = new_tegra_state->avg_memory_bandwidth;
+ old_avg_bw = old_tegra_state->avg_memory_bandwidth;
+
+ new_peak_bw = new_dc_state->plane_peak_bw[tegra->index];
+ old_peak_bw = old_dc_state->plane_peak_bw[tegra->index];
+
+ /*
+ * See the comment related to !crtc->state->active above,
+ * which explains why bandwidths need to be updated when
+ * CRTC is turning ON.
+ */
+ if (new_avg_bw == old_avg_bw && new_peak_bw == old_peak_bw &&
+ old_crtc_state->active)
+ continue;
+
+ window.src.h = drm_rect_height(&plane->state->src) >> 16;
+ window.dst.h = drm_rect_height(&plane->state->dst);
+
+ old_window.src.h = drm_rect_height(&old_plane_state->src) >> 16;
+ old_window.dst.h = drm_rect_height(&old_plane_state->dst);
+
+ /*
+ * During the preparation phase (atomic_begin), the memory
+ * freq should go high before the DC changes are committed
+ * if bandwidth requirement goes up, otherwise memory freq
+ * should to stay high if BW requirement goes down. The
+ * opposite applies to the completion phase (post_commit).
+ */
+ if (prepare_bandwidth_transition) {
+ new_avg_bw = max(old_avg_bw, new_avg_bw);
+ new_peak_bw = max(old_peak_bw, new_peak_bw);
+
+ if (tegra_plane_use_vertical_filtering(tegra, &old_window))
+ window = old_window;
+ }
+
+ icc_set_bw(tegra->icc_mem, new_avg_bw, new_peak_bw);
+
+ if (tegra_plane_use_vertical_filtering(tegra, &window))
+ icc_set_bw(tegra->icc_mem_vfilter, new_avg_bw, new_peak_bw);
+ else
+ icc_set_bw(tegra->icc_mem_vfilter, 0, 0);
+ }
+}
+
static void tegra_crtc_atomic_disable(struct drm_crtc *crtc,
struct drm_atomic_state *state)
{
@@ -1922,6 +2052,8 @@ static void tegra_crtc_atomic_begin(struct drm_crtc *crtc,
{
unsigned long flags;

+ tegra_crtc_update_memory_bandwidth(crtc, state, true);
+
if (crtc->state->event) {
spin_lock_irqsave(&crtc->dev->event_lock, flags);

@@ -1954,7 +2086,212 @@ static void tegra_crtc_atomic_flush(struct drm_crtc *crtc,
value = tegra_dc_readl(dc, DC_CMD_STATE_CONTROL);
}

+static bool tegra_plane_is_cursor(const struct drm_plane_state *state)
+{
+ const struct tegra_dc_soc_info *soc = to_tegra_dc(state->crtc)->soc;
+ const struct drm_format_info *fmt = state->fb->format;
+ unsigned int src_w = drm_rect_width(&state->src) >> 16;
+ unsigned int dst_w = drm_rect_width(&state->dst);
+
+ if (state->plane->type != DRM_PLANE_TYPE_CURSOR)
+ return false;
+
+ if (soc->supports_cursor)
+ return true;
+
+ if (src_w != dst_w || fmt->num_planes != 1 || src_w * fmt->cpp[0] > 256)
+ return false;
+
+ return true;
+}
+
+static unsigned long
+tegra_plane_overlap_mask(struct drm_crtc_state *state,
+ const struct drm_plane_state *plane_state)
+{
+ const struct drm_plane_state *other_state;
+ const struct tegra_plane *tegra;
+ unsigned long overlap_mask = 0;
+ struct drm_plane *plane;
+ struct drm_rect rect;
+
+ if (!plane_state->visible || !plane_state->fb)
+ return 0;
+
+ drm_atomic_crtc_state_for_each_plane_state(plane, other_state, state) {
+ rect = plane_state->dst;
+
+ tegra = to_tegra_plane(other_state->plane);
+
+ if (!other_state->visible || !other_state->fb)
+ continue;
+
+ /*
+ * Ignore cursor plane overlaps because it's not practical to
+ * assume that it contributes to the bandwidth in overlapping
+ * area if window width is small.
+ */
+ if (tegra_plane_is_cursor(other_state))
+ continue;
+
+ if (drm_rect_intersect(&rect, &other_state->dst))
+ overlap_mask |= BIT(tegra->index);
+ }
+
+ /*
+ * Data prefetch FIFO will easily help to overcome temporal memory
+ * pressure if other plane overlaps with the cursor plane.
+ */
+ if (tegra_plane_is_cursor(plane_state) && overlap_mask)
+ return 0;
+
+ return overlap_mask;
+}
+
+static struct drm_plane *
+tegra_crtc_get_plane_by_index(struct drm_crtc *crtc, unsigned int index)
+{
+ struct drm_plane *plane;
+
+ drm_atomic_crtc_for_each_plane(plane, crtc) {
+ if (to_tegra_plane(plane)->index == index)
+ return plane;
+ }
+
+ return NULL;
+}
+
+static int tegra_crtc_check_bandwidth_state(struct drm_crtc *crtc,
+ struct drm_atomic_state *state)
+{
+ ulong overlap_mask[TEGRA_DC_LEGACY_PLANES_NUM] = {}, mask;
+ u32 plane_peak_bw[TEGRA_DC_LEGACY_PLANES_NUM] = {};
+ bool all_planes_overlap_simultaneously = true;
+ const struct tegra_plane_state *tegra_state;
+ const struct drm_plane_state *plane_state;
+ const struct tegra_dc_state *old_dc_state;
+ struct tegra_dc *dc = to_tegra_dc(crtc);
+ const struct drm_crtc_state *old_state;
+ struct tegra_dc_state *new_dc_state;
+ struct drm_crtc_state *new_state;
+ struct tegra_plane *tegra;
+ struct drm_plane *plane;
+ u32 i, k, overlap_bw;
+
+ /*
+ * The nv-display uses shared planes. The algorithm below assumes
+ * maximum 3 planes per-CRTC, this assumption isn't applicable to
+ * the nv-display. Note that T124 support has additional windows,
+ * but currently they aren't supported by the driver.
+ */
+ if (dc->soc->has_nvdisplay)
+ return 0;
+
+ new_state = drm_atomic_get_new_crtc_state(state, crtc);
+ new_dc_state = to_dc_state(new_state);
+
+ /*
+ * For overlapping planes pixel's data is fetched for each plane at
+ * the same time, hence bandwidths are accumulated in this case.
+ * This needs to be taken into account for calculating total bandwidth
+ * consumed by all planes.
+ *
+ * Here we get the overlapping state of each plane, which is a
+ * bitmask of plane indices telling with what planes there is an
+ * overlap. Note that bitmask[plane] includes BIT(plane) in order
+ * to make further code nicer and simpler.
+ */
+ drm_atomic_crtc_state_for_each_plane_state(plane, plane_state, new_state) {
+ tegra_state = to_const_tegra_plane_state(plane_state);
+ tegra = to_tegra_plane(plane);
+
+ plane_peak_bw[tegra->index] = tegra_state->peak_memory_bandwidth;
+ mask = tegra_plane_overlap_mask(new_state, plane_state);
+ overlap_mask[tegra->index] = mask;
+
+ if (hweight_long(mask) != 3)
+ all_planes_overlap_simultaneously = false;
+ }
+
+ old_state = drm_atomic_get_old_crtc_state(state, crtc);
+ old_dc_state = to_const_dc_state(old_state);
+
+ /*
+ * Then we calculate maximum bandwidth of each plane state.
+ * The bandwidth includes the plane BW + BW of the "simultaneously"
+ * overlapping planes, where "simultaneously" means areas where DC
+ * fetches from the planes simultaneously during of scan-out process.
+ *
+ * For example, if plane A overlaps with planes B and C, but B and C
+ * don't overlap, then the peak bandwidth will be either in area where
+ * A-and-B or A-and-C planes overlap.
+ *
+ * The plane_peak_bw[] contains peak memory bandwidth values of
+ * each plane, this information is needed by interconnect provider
+ * in order to set up latency allowness based on the peak BW, see
+ * tegra_crtc_update_memory_bandwidth().
+ */
+ for (i = 0; i < ARRAY_SIZE(plane_peak_bw); i++) {
+ overlap_bw = 0;
+
+ for_each_set_bit(k, &overlap_mask[i], 3) {
+ if (k == i)
+ continue;
+
+ if (all_planes_overlap_simultaneously)
+ overlap_bw += plane_peak_bw[k];
+ else
+ overlap_bw = max(overlap_bw, plane_peak_bw[k]);
+ }
+
+ new_dc_state->plane_peak_bw[i] = plane_peak_bw[i] + overlap_bw;
+
+ /*
+ * If plane's peak bandwidth changed (for example plane isn't
+ * overlapped anymore) and plane isn't in the atomic state,
+ * then add plane to the state in order to have the bandwidth
+ * updated.
+ */
+ if (old_dc_state->plane_peak_bw[i] !=
+ new_dc_state->plane_peak_bw[i]) {
+ plane = tegra_crtc_get_plane_by_index(crtc, i);
+ if (!plane)
+ continue;
+
+ plane_state = drm_atomic_get_plane_state(state, plane);
+ if (IS_ERR(plane_state))
+ return PTR_ERR(plane_state);
+ }
+ }
+
+ return 0;
+}
+
+static int tegra_crtc_atomic_check(struct drm_crtc *crtc,
+ struct drm_atomic_state *state)
+{
+ int err;
+
+ err = tegra_crtc_check_bandwidth_state(crtc, state);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+void tegra_crtc_atomic_post_commit(struct drm_crtc *crtc,
+ struct drm_atomic_state *state)
+{
+ /*
+ * Display bandwidth is allowed to go down only once hardware state
+ * is known to be armed, i.e. state was committed and VBLANK event
+ * received.
+ */
+ tegra_crtc_update_memory_bandwidth(crtc, state, false);
+}
+
static const struct drm_crtc_helper_funcs tegra_crtc_helper_funcs = {
+ .atomic_check = tegra_crtc_atomic_check,
.atomic_begin = tegra_crtc_atomic_begin,
.atomic_flush = tegra_crtc_atomic_flush,
.atomic_enable = tegra_crtc_atomic_enable,
@@ -2245,7 +2582,9 @@ static const struct tegra_dc_soc_info tegra20_dc_soc_info = {
.overlay_formats = tegra20_overlay_formats,
.modifiers = tegra20_modifiers,
.has_win_a_without_filters = true,
+ .has_win_b_vfilter_mem_client = true,
.has_win_c_without_vert_filter = true,
+ .plane_tiled_memory_bandwidth_x2 = false,
};

static const struct tegra_dc_soc_info tegra30_dc_soc_info = {
@@ -2264,7 +2603,9 @@ static const struct tegra_dc_soc_info tegra30_dc_soc_info = {
.overlay_formats = tegra20_overlay_formats,
.modifiers = tegra20_modifiers,
.has_win_a_without_filters = false,
+ .has_win_b_vfilter_mem_client = true,
.has_win_c_without_vert_filter = false,
+ .plane_tiled_memory_bandwidth_x2 = true,
};

static const struct tegra_dc_soc_info tegra114_dc_soc_info = {
@@ -2283,7 +2624,9 @@ static const struct tegra_dc_soc_info tegra114_dc_soc_info = {
.overlay_formats = tegra114_overlay_formats,
.modifiers = tegra20_modifiers,
.has_win_a_without_filters = false,
+ .has_win_b_vfilter_mem_client = false,
.has_win_c_without_vert_filter = false,
+ .plane_tiled_memory_bandwidth_x2 = true,
};

static const struct tegra_dc_soc_info tegra124_dc_soc_info = {
@@ -2302,7 +2645,9 @@ static const struct tegra_dc_soc_info tegra124_dc_soc_info = {
.overlay_formats = tegra124_overlay_formats,
.modifiers = tegra124_modifiers,
.has_win_a_without_filters = false,
+ .has_win_b_vfilter_mem_client = false,
.has_win_c_without_vert_filter = false,
+ .plane_tiled_memory_bandwidth_x2 = false,
};

static const struct tegra_dc_soc_info tegra210_dc_soc_info = {
@@ -2321,7 +2666,9 @@ static const struct tegra_dc_soc_info tegra210_dc_soc_info = {
.overlay_formats = tegra114_overlay_formats,
.modifiers = tegra124_modifiers,
.has_win_a_without_filters = false,
+ .has_win_b_vfilter_mem_client = false,
.has_win_c_without_vert_filter = false,
+ .plane_tiled_memory_bandwidth_x2 = false,
};

static const struct tegra_windowgroup_soc tegra186_dc_wgrps[] = {
@@ -2370,6 +2717,7 @@ static const struct tegra_dc_soc_info tegra186_dc_soc_info = {
.has_nvdisplay = true,
.wgrps = tegra186_dc_wgrps,
.num_wgrps = ARRAY_SIZE(tegra186_dc_wgrps),
+ .plane_tiled_memory_bandwidth_x2 = false,
};

static const struct tegra_windowgroup_soc tegra194_dc_wgrps[] = {
@@ -2418,6 +2766,7 @@ static const struct tegra_dc_soc_info tegra194_dc_soc_info = {
.has_nvdisplay = true,
.wgrps = tegra194_dc_wgrps,
.num_wgrps = ARRAY_SIZE(tegra194_dc_wgrps),
+ .plane_tiled_memory_bandwidth_x2 = false,
};

static const struct of_device_id tegra_dc_of_match[] = {
diff --git a/drivers/gpu/drm/tegra/dc.h b/drivers/gpu/drm/tegra/dc.h
index 051d03dcb9b0..0d7bdf66a1ec 100644
--- a/drivers/gpu/drm/tegra/dc.h
+++ b/drivers/gpu/drm/tegra/dc.h
@@ -15,6 +15,8 @@

struct tegra_output;

+#define TEGRA_DC_LEGACY_PLANES_NUM 6
+
struct tegra_dc_state {
struct drm_crtc_state base;

@@ -23,6 +25,8 @@ struct tegra_dc_state {
unsigned int div;

u32 planes;
+
+ unsigned long plane_peak_bw[TEGRA_DC_LEGACY_PLANES_NUM];
};

static inline struct tegra_dc_state *to_dc_state(struct drm_crtc_state *state)
@@ -33,6 +37,12 @@ static inline struct tegra_dc_state *to_dc_state(struct drm_crtc_state *state)
return NULL;
}

+static inline const struct tegra_dc_state *
+to_const_dc_state(const struct drm_crtc_state *state)
+{
+ return to_dc_state((struct drm_crtc_state *)state);
+}
+
struct tegra_dc_stats {
unsigned long frames;
unsigned long vblank;
@@ -65,7 +75,9 @@ struct tegra_dc_soc_info {
unsigned int num_overlay_formats;
const u64 *modifiers;
bool has_win_a_without_filters;
+ bool has_win_b_vfilter_mem_client;
bool has_win_c_without_vert_filter;
+ unsigned int plane_tiled_memory_bandwidth_x2;
};

struct tegra_dc {
@@ -151,6 +163,8 @@ int tegra_dc_state_setup_clock(struct tegra_dc *dc,
struct drm_crtc_state *crtc_state,
struct clk *clk, unsigned long pclk,
unsigned int div);
+void tegra_crtc_atomic_post_commit(struct drm_crtc *crtc,
+ struct drm_atomic_state *state);

/* from rgb.c */
int tegra_dc_rgb_probe(struct tegra_dc *dc);
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index e45c8414e2a3..1472042da2a7 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -20,6 +20,7 @@
#include <drm/drm_prime.h>
#include <drm/drm_vblank.h>

+#include "dc.h"
#include "drm.h"
#include "gem.h"

@@ -59,6 +60,17 @@ static const struct drm_mode_config_funcs tegra_drm_mode_config_funcs = {
.atomic_commit = drm_atomic_helper_commit,
};

+static void tegra_atomic_post_commit(struct drm_device *drm,
+ struct drm_atomic_state *old_state)
+{
+ struct drm_crtc_state *old_crtc_state __maybe_unused;
+ struct drm_crtc *crtc;
+ unsigned int i;
+
+ for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i)
+ tegra_crtc_atomic_post_commit(crtc, old_state);
+}
+
static void tegra_atomic_commit_tail(struct drm_atomic_state *old_state)
{
struct drm_device *drm = old_state->dev;
@@ -75,6 +87,8 @@ static void tegra_atomic_commit_tail(struct drm_atomic_state *old_state)
} else {
drm_atomic_helper_commit_tail_rpm(old_state);
}
+
+ tegra_atomic_post_commit(drm, old_state);
}

static const struct drm_mode_config_helper_funcs
diff --git a/drivers/gpu/drm/tegra/hub.c b/drivers/gpu/drm/tegra/hub.c
index 22a03f7ffdc1..4fa338dc7eb2 100644
--- a/drivers/gpu/drm/tegra/hub.c
+++ b/drivers/gpu/drm/tegra/hub.c
@@ -344,6 +344,9 @@ static int tegra_shared_plane_atomic_check(struct drm_plane *plane,
struct tegra_dc *dc = to_tegra_dc(state->crtc);
int err;

+ plane_state->peak_memory_bandwidth = 0;
+ plane_state->avg_memory_bandwidth = 0;
+
/* no need for further checks if the plane is being disabled */
if (!state->crtc || !state->fb)
return 0;
diff --git a/drivers/gpu/drm/tegra/plane.c b/drivers/gpu/drm/tegra/plane.c
index 539d14935728..1e589c5af143 100644
--- a/drivers/gpu/drm/tegra/plane.c
+++ b/drivers/gpu/drm/tegra/plane.c
@@ -4,6 +4,7 @@
*/

#include <linux/iommu.h>
+#include <linux/interconnect.h>

#include <drm/drm_atomic.h>
#include <drm/drm_atomic_helper.h>
@@ -64,6 +65,8 @@ tegra_plane_atomic_duplicate_state(struct drm_plane *plane)
copy->reflect_x = state->reflect_x;
copy->reflect_y = state->reflect_y;
copy->opaque = state->opaque;
+ copy->peak_memory_bandwidth = state->peak_memory_bandwidth;
+ copy->avg_memory_bandwidth = state->avg_memory_bandwidth;

for (i = 0; i < 2; i++)
copy->blending[i] = state->blending[i];
@@ -212,6 +215,87 @@ void tegra_plane_cleanup_fb(struct drm_plane *plane,
tegra_dc_unpin(dc, to_tegra_plane_state(state));
}

+static int tegra_plane_check_memory_bandwidth(struct drm_plane_state *state)
+{
+ struct tegra_plane_state *tegra_state = to_tegra_plane_state(state);
+ unsigned int i, bpp, bpp_plane, dst_w, src_w, src_h, mul;
+ const struct tegra_dc_soc_info *soc;
+ const struct drm_format_info *fmt;
+ struct drm_crtc_state *crtc_state;
+ u32 avg_bandwidth, peak_bandwidth;
+
+ if (!state->visible)
+ return 0;
+
+ crtc_state = drm_atomic_get_new_crtc_state(state->state, state->crtc);
+ if (!crtc_state)
+ return -EINVAL;
+
+ src_w = drm_rect_width(&state->src) >> 16;
+ src_h = drm_rect_height(&state->src) >> 16;
+ dst_w = drm_rect_width(&state->dst);
+
+ fmt = state->fb->format;
+ soc = to_tegra_dc(state->crtc)->soc;
+
+ /*
+ * Note that real memory bandwidth vary depending on format and
+ * memory layout, we are not taking that into account because small
+ * estimation error isn't important since bandwidth is rounded up
+ * anyway.
+ */
+ for (i = 0, bpp = 0; i < fmt->num_planes; i++) {
+ bpp_plane = fmt->cpp[i] * 8;
+
+ /*
+ * Sub-sampling is relevant for chroma planes only and vertical
+ * readouts are not cached, hence only horizontal sub-sampling
+ * matters.
+ */
+ if (i > 0)
+ bpp_plane /= fmt->hsub;
+
+ bpp += bpp_plane;
+ }
+
+ /*
+ * Horizontal downscale takes extra bandwidth which roughly depends
+ * on the scaled width.
+ */
+ if (src_w > dst_w)
+ mul = (src_w - dst_w) * bpp / 2048 + 1;
+ else
+ mul = 1;
+
+ /* average bandwidth in bytes/s */
+ avg_bandwidth = src_w * src_h * bpp / 8 * mul;
+ avg_bandwidth *= drm_mode_vrefresh(&crtc_state->mode);
+
+ /* mode.clock in kHz, peak bandwidth in kbit/s */
+ peak_bandwidth = crtc_state->mode.clock * bpp * mul;
+
+ /* ICC bandwidth in kbyte/s */
+ peak_bandwidth = kbps_to_icc(peak_bandwidth);
+ avg_bandwidth = Bps_to_icc(avg_bandwidth);
+
+ /*
+ * Tegra30/114 Memory Controller can't interleave DC memory requests
+ * and DC uses 16-bytes atom for the tiled windows, while DDR3 uses 32
+ * bytes atom. Hence there is x2 memory overfetch for tiled framebuffer
+ * and DDR3 on older SoCs.
+ */
+ if (soc->plane_tiled_memory_bandwidth_x2 &&
+ tegra_state->tiling.mode == TEGRA_BO_TILING_MODE_TILED) {
+ peak_bandwidth *= 2;
+ avg_bandwidth *= 2;
+ }
+
+ tegra_state->peak_memory_bandwidth = peak_bandwidth;
+ tegra_state->avg_memory_bandwidth = avg_bandwidth;
+
+ return 0;
+}
+
int tegra_plane_state_add(struct tegra_plane *plane,
struct drm_plane_state *state)
{
@@ -230,6 +314,10 @@ int tegra_plane_state_add(struct tegra_plane *plane,
if (err < 0)
return err;

+ err = tegra_plane_check_memory_bandwidth(state);
+ if (err < 0)
+ return err;
+
tegra = to_dc_state(crtc_state);

tegra->planes |= WIN_A_ACT_REQ << plane->index;
@@ -595,3 +683,36 @@ int tegra_plane_setup_legacy_state(struct tegra_plane *tegra,

return 0;
}
+
+static const char * const tegra_plane_icc_names[] = {
+ "wina", "winb", "winc", "", "", "", "cursor",
+};
+
+int tegra_plane_interconnect_init(struct tegra_plane *plane)
+{
+ const char *icc_name = tegra_plane_icc_names[plane->index];
+ struct device *dev = plane->dc->dev;
+ struct tegra_dc *dc = plane->dc;
+ int err;
+
+ plane->icc_mem = devm_of_icc_get(dev, icc_name);
+ err = PTR_ERR_OR_ZERO(plane->icc_mem);
+ if (err) {
+ dev_err_probe(dev, err, "failed to get %s interconnect\n",
+ icc_name);
+ return err;
+ }
+
+ /* plane B on T20/30 has a dedicated memory client for a 6-tap vertical filter */
+ if (plane->index == 1 && dc->soc->has_win_b_vfilter_mem_client) {
+ plane->icc_mem_vfilter = devm_of_icc_get(dev, "winb-vfilter");
+ err = PTR_ERR_OR_ZERO(plane->icc_mem_vfilter);
+ if (err) {
+ dev_err_probe(dev, err, "failed to get %s interconnect\n",
+ "winb-vfilter");
+ return err;
+ }
+ }
+
+ return 0;
+}
diff --git a/drivers/gpu/drm/tegra/plane.h b/drivers/gpu/drm/tegra/plane.h
index c691dd79b27b..f2731aae7d01 100644
--- a/drivers/gpu/drm/tegra/plane.h
+++ b/drivers/gpu/drm/tegra/plane.h
@@ -8,6 +8,7 @@

#include <drm/drm_plane.h>

+struct icc_path;
struct tegra_bo;
struct tegra_dc;

@@ -16,6 +17,9 @@ struct tegra_plane {
struct tegra_dc *dc;
unsigned int offset;
unsigned int index;
+
+ struct icc_path *icc_mem;
+ struct icc_path *icc_mem_vfilter;
};

struct tegra_cursor {
@@ -52,6 +56,10 @@ struct tegra_plane_state {
/* used for legacy blending support only */
struct tegra_plane_legacy_blending_state blending[2];
bool opaque;
+
+ /* bandwidths are in ICC units, i.e. kbytes/sec */
+ u32 peak_memory_bandwidth;
+ u32 avg_memory_bandwidth;
};

static inline struct tegra_plane_state *
@@ -63,6 +71,12 @@ to_tegra_plane_state(struct drm_plane_state *state)
return NULL;
}

+static inline const struct tegra_plane_state *
+to_const_tegra_plane_state(const struct drm_plane_state *state)
+{
+ return to_tegra_plane_state((struct drm_plane_state *)state);
+}
+
extern const struct drm_plane_funcs tegra_plane_funcs;

int tegra_plane_prepare_fb(struct drm_plane *plane,
@@ -77,5 +91,6 @@ int tegra_plane_format(u32 fourcc, u32 *format, u32 *swap);
bool tegra_plane_format_is_yuv(unsigned int format, bool *planar);
int tegra_plane_setup_legacy_state(struct tegra_plane *tegra,
struct tegra_plane_state *state);
+int tegra_plane_interconnect_init(struct tegra_plane *plane);

#endif /* TEGRA_PLANE_H */
--
2.29.2

2020-11-15 22:54:57

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v9 11/17] ARM: tegra: Add interconnect properties to Tegra20 device-tree

Add interconnect properties to the Memory Controller, External Memory
Controller and the Display Controller nodes in order to describe hardware
interconnection.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
arch/arm/boot/dts/tegra20.dtsi | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index 9347f7789245..2e1304493f7d 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -111,6 +111,17 @@ dc@54200000 {

nvidia,head = <0>;

+ interconnects = <&mc TEGRA20_MC_DISPLAY0A &emc>,
+ <&mc TEGRA20_MC_DISPLAY0B &emc>,
+ <&mc TEGRA20_MC_DISPLAY1B &emc>,
+ <&mc TEGRA20_MC_DISPLAY0C &emc>,
+ <&mc TEGRA20_MC_DISPLAYHC &emc>;
+ interconnect-names = "wina",
+ "winb",
+ "winb-vfilter",
+ "winc",
+ "cursor";
+
rgb {
status = "disabled";
};
@@ -128,6 +139,17 @@ dc@54240000 {

nvidia,head = <1>;

+ interconnects = <&mc TEGRA20_MC_DISPLAY0AB &emc>,
+ <&mc TEGRA20_MC_DISPLAY0BB &emc>,
+ <&mc TEGRA20_MC_DISPLAY1BB &emc>,
+ <&mc TEGRA20_MC_DISPLAY0CB &emc>,
+ <&mc TEGRA20_MC_DISPLAYHCB &emc>;
+ interconnect-names = "wina",
+ "winb",
+ "winb-vfilter",
+ "winc",
+ "cursor";
+
rgb {
status = "disabled";
};
@@ -630,15 +652,17 @@ mc: memory-controller@7000f000 {
interrupts = <GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH>;
#reset-cells = <1>;
#iommu-cells = <0>;
+ #interconnect-cells = <1>;
};

- memory-controller@7000f400 {
+ emc: memory-controller@7000f400 {
compatible = "nvidia,tegra20-emc";
reg = <0x7000f400 0x400>;
interrupts = <GIC_SPI 78 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&tegra_car TEGRA20_CLK_EMC>;
#address-cells = <1>;
#size-cells = <0>;
+ #interconnect-cells = <0>;
};

fuse@7000f800 {
--
2.29.2

2020-11-15 22:54:57

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v9 06/17] drm/tegra: dc: Extend debug stats with total number of events

It's useful to know the total number of underflow events and currently
the debug stats are getting reset each time CRTC is being disabled. Let's
account the overall number of events that doesn't get a reset.

Tested-by: Peter Geis <[email protected]>
Tested-by: Nicolas Chauvet <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/tegra/dc.c | 10 ++++++++++
drivers/gpu/drm/tegra/dc.h | 5 +++++
2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 5c587cfd1bb2..b6676f1fe358 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -1539,6 +1539,11 @@ static int tegra_dc_show_stats(struct seq_file *s, void *data)
seq_printf(s, "underflow: %lu\n", dc->stats.underflow);
seq_printf(s, "overflow: %lu\n", dc->stats.overflow);

+ seq_printf(s, "frames total: %lu\n", dc->stats.frames_total);
+ seq_printf(s, "vblank total: %lu\n", dc->stats.vblank_total);
+ seq_printf(s, "underflow total: %lu\n", dc->stats.underflow_total);
+ seq_printf(s, "overflow total: %lu\n", dc->stats.overflow_total);
+
return 0;
}

@@ -2310,6 +2315,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
/*
dev_dbg(dc->dev, "%s(): frame end\n", __func__);
*/
+ dc->stats.frames_total++;
dc->stats.frames++;
}

@@ -2318,6 +2324,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
dev_dbg(dc->dev, "%s(): vertical blank\n", __func__);
*/
drm_crtc_handle_vblank(&dc->base);
+ dc->stats.vblank_total++;
dc->stats.vblank++;
}

@@ -2325,6 +2332,7 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
/*
dev_dbg(dc->dev, "%s(): underflow\n", __func__);
*/
+ dc->stats.underflow_total++;
dc->stats.underflow++;
}

@@ -2332,11 +2340,13 @@ static irqreturn_t tegra_dc_irq(int irq, void *data)
/*
dev_dbg(dc->dev, "%s(): overflow\n", __func__);
*/
+ dc->stats.overflow_total++;
dc->stats.overflow++;
}

if (status & HEAD_UF_INT) {
dev_dbg_ratelimited(dc->dev, "%s(): head underflow\n", __func__);
+ dc->stats.underflow_total++;
dc->stats.underflow++;
}

diff --git a/drivers/gpu/drm/tegra/dc.h b/drivers/gpu/drm/tegra/dc.h
index 0d7bdf66a1ec..ba4ed35139fb 100644
--- a/drivers/gpu/drm/tegra/dc.h
+++ b/drivers/gpu/drm/tegra/dc.h
@@ -48,6 +48,11 @@ struct tegra_dc_stats {
unsigned long vblank;
unsigned long underflow;
unsigned long overflow;
+
+ unsigned long frames_total;
+ unsigned long vblank_total;
+ unsigned long underflow_total;
+ unsigned long overflow_total;
};

struct tegra_windowgroup_soc {
--
2.29.2

2020-11-15 22:54:57

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v9 02/17] memory: tegra124-emc: Make driver modular

Add modularization support to the Tegra124 EMC driver, which now can be
compiled as a loadable kernel module.

Note that EMC clock must be registered at clk-init time, otherwise PLLM
will be disabled as unused clock at boot time if EMC driver is compiled
as a module. Hence add a prepare/complete callbacks. similarly to what is
done for the Tegra20/30 EMC drivers.

Tested-by: Nicolas Chauvet <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/clk/tegra/Kconfig | 3 ++
drivers/clk/tegra/Makefile | 2 +-
drivers/clk/tegra/clk-tegra124-emc.c | 41 ++++++++++++++++++++++++----
drivers/clk/tegra/clk-tegra124.c | 26 ++++++++++++++++--
drivers/clk/tegra/clk.h | 18 ++++++++----
drivers/memory/tegra/Kconfig | 3 +-
drivers/memory/tegra/tegra124-emc.c | 31 ++++++++++++++-------
include/linux/clk/tegra.h | 8 ++++++
include/soc/tegra/emc.h | 16 -----------
9 files changed, 106 insertions(+), 42 deletions(-)
delete mode 100644 include/soc/tegra/emc.h

diff --git a/drivers/clk/tegra/Kconfig b/drivers/clk/tegra/Kconfig
index deaa4605824c..90df619dc087 100644
--- a/drivers/clk/tegra/Kconfig
+++ b/drivers/clk/tegra/Kconfig
@@ -7,3 +7,6 @@ config TEGRA_CLK_DFLL
depends on ARCH_TEGRA_124_SOC || ARCH_TEGRA_210_SOC
select PM_OPP
def_bool y
+
+config TEGRA124_CLK_EMC
+ bool
diff --git a/drivers/clk/tegra/Makefile b/drivers/clk/tegra/Makefile
index eec2313fd37e..7b1816856eb5 100644
--- a/drivers/clk/tegra/Makefile
+++ b/drivers/clk/tegra/Makefile
@@ -22,7 +22,7 @@ obj-$(CONFIG_ARCH_TEGRA_3x_SOC) += clk-tegra20-emc.o
obj-$(CONFIG_ARCH_TEGRA_114_SOC) += clk-tegra114.o
obj-$(CONFIG_ARCH_TEGRA_124_SOC) += clk-tegra124.o
obj-$(CONFIG_TEGRA_CLK_DFLL) += clk-tegra124-dfll-fcpu.o
-obj-$(CONFIG_TEGRA124_EMC) += clk-tegra124-emc.o
+obj-$(CONFIG_TEGRA124_CLK_EMC) += clk-tegra124-emc.o
obj-$(CONFIG_ARCH_TEGRA_132_SOC) += clk-tegra124.o
obj-y += cvb.o
obj-$(CONFIG_ARCH_TEGRA_210_SOC) += clk-tegra210.o
diff --git a/drivers/clk/tegra/clk-tegra124-emc.c b/drivers/clk/tegra/clk-tegra124-emc.c
index 745f9faa98d8..bdf6f4a51617 100644
--- a/drivers/clk/tegra/clk-tegra124-emc.c
+++ b/drivers/clk/tegra/clk-tegra124-emc.c
@@ -11,7 +11,9 @@
#include <linux/clk-provider.h>
#include <linux/clk.h>
#include <linux/clkdev.h>
+#include <linux/clk/tegra.h>
#include <linux/delay.h>
+#include <linux/export.h>
#include <linux/io.h>
#include <linux/module.h>
#include <linux/of_address.h>
@@ -21,7 +23,6 @@
#include <linux/string.h>

#include <soc/tegra/fuse.h>
-#include <soc/tegra/emc.h>

#include "clk.h"

@@ -80,6 +81,9 @@ struct tegra_clk_emc {
int num_timings;
struct emc_timing *timings;
spinlock_t *lock;
+
+ tegra124_emc_prepare_timing_change_cb *prepare_timing_change;
+ tegra124_emc_complete_timing_change_cb *complete_timing_change;
};

/* Common clock framework callback implementations */
@@ -176,6 +180,9 @@ static struct tegra_emc *emc_ensure_emc_driver(struct tegra_clk_emc *tegra)
if (tegra->emc)
return tegra->emc;

+ if (!tegra->prepare_timing_change || !tegra->complete_timing_change)
+ return NULL;
+
if (!tegra->emc_node)
return NULL;

@@ -241,7 +248,7 @@ static int emc_set_timing(struct tegra_clk_emc *tegra,

div = timing->parent_rate / (timing->rate / 2) - 2;

- err = tegra_emc_prepare_timing_change(emc, timing->rate);
+ err = tegra->prepare_timing_change(emc, timing->rate);
if (err)
return err;

@@ -259,7 +266,7 @@ static int emc_set_timing(struct tegra_clk_emc *tegra,

spin_unlock_irqrestore(tegra->lock, flags);

- tegra_emc_complete_timing_change(emc, timing->rate);
+ tegra->complete_timing_change(emc, timing->rate);

clk_hw_reparent(&tegra->hw, __clk_get_hw(timing->parent));
clk_disable_unprepare(tegra->prev_parent);
@@ -473,8 +480,8 @@ static const struct clk_ops tegra_clk_emc_ops = {
.get_parent = emc_get_parent,
};

-struct clk *tegra_clk_register_emc(void __iomem *base, struct device_node *np,
- spinlock_t *lock)
+struct clk *tegra124_clk_register_emc(void __iomem *base, struct device_node *np,
+ spinlock_t *lock)
{
struct tegra_clk_emc *tegra;
struct clk_init_data init;
@@ -538,3 +545,27 @@ struct clk *tegra_clk_register_emc(void __iomem *base, struct device_node *np,

return clk;
};
+
+void tegra124_clk_set_emc_callbacks(tegra124_emc_prepare_timing_change_cb *prep_cb,
+ tegra124_emc_complete_timing_change_cb *complete_cb)
+{
+ struct clk *clk = __clk_lookup("emc");
+ struct tegra_clk_emc *tegra;
+ struct clk_hw *hw;
+
+ if (clk) {
+ hw = __clk_get_hw(clk);
+ tegra = container_of(hw, struct tegra_clk_emc, hw);
+
+ tegra->prepare_timing_change = prep_cb;
+ tegra->complete_timing_change = complete_cb;
+ }
+}
+EXPORT_SYMBOL_GPL(tegra124_clk_set_emc_callbacks);
+
+bool tegra124_clk_emc_driver_available(struct clk_hw *hw)
+{
+ struct tegra_clk_emc *tegra = container_of(hw, struct tegra_clk_emc, hw);
+
+ return tegra->prepare_timing_change && tegra->complete_timing_change;
+}
diff --git a/drivers/clk/tegra/clk-tegra124.c b/drivers/clk/tegra/clk-tegra124.c
index e931319dcc9d..934520aab6e3 100644
--- a/drivers/clk/tegra/clk-tegra124.c
+++ b/drivers/clk/tegra/clk-tegra124.c
@@ -1500,6 +1500,26 @@ static void __init tegra124_132_clock_init_pre(struct device_node *np)
writel(plld_base, clk_base + PLLD_BASE);
}

+static struct clk *tegra124_clk_src_onecell_get(struct of_phandle_args *clkspec,
+ void *data)
+{
+ struct clk_hw *hw;
+ struct clk *clk;
+
+ clk = of_clk_src_onecell_get(clkspec, data);
+ if (IS_ERR(clk))
+ return clk;
+
+ hw = __clk_get_hw(clk);
+
+ if (clkspec->args[0] == TEGRA124_CLK_EMC) {
+ if (!tegra124_clk_emc_driver_available(hw))
+ return ERR_PTR(-EPROBE_DEFER);
+ }
+
+ return clk;
+}
+
/**
* tegra124_132_clock_init_post - clock initialization postamble for T124/T132
* @np: struct device_node * of the DT node for the SoC CAR IP block
@@ -1516,10 +1536,10 @@ static void __init tegra124_132_clock_init_post(struct device_node *np)
&pll_x_params);
tegra_init_special_resets(1, tegra124_reset_assert,
tegra124_reset_deassert);
- tegra_add_of_provider(np, of_clk_src_onecell_get);
+ tegra_add_of_provider(np, tegra124_clk_src_onecell_get);

- clks[TEGRA124_CLK_EMC] = tegra_clk_register_emc(clk_base, np,
- &emc_lock);
+ clks[TEGRA124_CLK_EMC] = tegra124_clk_register_emc(clk_base, np,
+ &emc_lock);

tegra_register_devclks(devclks, ARRAY_SIZE(devclks));

diff --git a/drivers/clk/tegra/clk.h b/drivers/clk/tegra/clk.h
index 6b565f6b5f66..c3e36b5dcc75 100644
--- a/drivers/clk/tegra/clk.h
+++ b/drivers/clk/tegra/clk.h
@@ -881,16 +881,22 @@ void tegra_super_clk_gen5_init(void __iomem *clk_base,
void __iomem *pmc_base, struct tegra_clk *tegra_clks,
struct tegra_clk_pll_params *pll_params);

-#ifdef CONFIG_TEGRA124_EMC
-struct clk *tegra_clk_register_emc(void __iomem *base, struct device_node *np,
- spinlock_t *lock);
+#ifdef CONFIG_TEGRA124_CLK_EMC
+struct clk *tegra124_clk_register_emc(void __iomem *base, struct device_node *np,
+ spinlock_t *lock);
+bool tegra124_clk_emc_driver_available(struct clk_hw *emc_hw);
#else
-static inline struct clk *tegra_clk_register_emc(void __iomem *base,
- struct device_node *np,
- spinlock_t *lock)
+static inline struct clk *
+tegra124_clk_register_emc(void __iomem *base, struct device_node *np,
+ spinlock_t *lock)
{
return NULL;
}
+
+static inline bool tegra124_clk_emc_driver_available(struct clk_hw *emc_hw)
+{
+ return false;
+}
#endif

void tegra114_clock_tune_cpu_trimmers_high(void);
diff --git a/drivers/memory/tegra/Kconfig b/drivers/memory/tegra/Kconfig
index ca7077a06f4c..f5b451403c58 100644
--- a/drivers/memory/tegra/Kconfig
+++ b/drivers/memory/tegra/Kconfig
@@ -32,9 +32,10 @@ config TEGRA30_EMC
external memory.

config TEGRA124_EMC
- bool "NVIDIA Tegra124 External Memory Controller driver"
+ tristate "NVIDIA Tegra124 External Memory Controller driver"
default y
depends on TEGRA_MC && ARCH_TEGRA_124_SOC
+ select TEGRA124_CLK_EMC
help
This driver is for the External Memory Controller (EMC) found on
Tegra124 chips. The EMC controls the external DRAM on the board.
diff --git a/drivers/memory/tegra/tegra124-emc.c b/drivers/memory/tegra/tegra124-emc.c
index ee8ee39e98ed..edfbf6d6d357 100644
--- a/drivers/memory/tegra/tegra124-emc.c
+++ b/drivers/memory/tegra/tegra124-emc.c
@@ -9,16 +9,17 @@
#include <linux/clk-provider.h>
#include <linux/clk.h>
#include <linux/clkdev.h>
+#include <linux/clk/tegra.h>
#include <linux/debugfs.h>
#include <linux/delay.h>
#include <linux/io.h>
+#include <linux/module.h>
#include <linux/of_address.h>
#include <linux/of_platform.h>
#include <linux/platform_device.h>
#include <linux/sort.h>
#include <linux/string.h>

-#include <soc/tegra/emc.h>
#include <soc/tegra/fuse.h>
#include <soc/tegra/mc.h>

@@ -562,8 +563,8 @@ static struct emc_timing *tegra_emc_find_timing(struct tegra_emc *emc,
return timing;
}

-int tegra_emc_prepare_timing_change(struct tegra_emc *emc,
- unsigned long rate)
+static int tegra_emc_prepare_timing_change(struct tegra_emc *emc,
+ unsigned long rate)
{
struct emc_timing *timing = tegra_emc_find_timing(emc, rate);
struct emc_timing *last = &emc->last_timing;
@@ -790,8 +791,8 @@ int tegra_emc_prepare_timing_change(struct tegra_emc *emc,
return 0;
}

-void tegra_emc_complete_timing_change(struct tegra_emc *emc,
- unsigned long rate)
+static void tegra_emc_complete_timing_change(struct tegra_emc *emc,
+ unsigned long rate)
{
struct emc_timing *timing = tegra_emc_find_timing(emc, rate);
struct emc_timing *last = &emc->last_timing;
@@ -987,6 +988,7 @@ static const struct of_device_id tegra_emc_of_match[] = {
{ .compatible = "nvidia,tegra132-emc" },
{}
};
+MODULE_DEVICE_TABLE(of, tegra_emc_of_match);

static struct device_node *
tegra_emc_find_node_by_ram_code(struct device_node *node, u32 ram_code)
@@ -1226,9 +1228,19 @@ static int tegra_emc_probe(struct platform_device *pdev)

platform_set_drvdata(pdev, emc);

+ tegra124_clk_set_emc_callbacks(tegra_emc_prepare_timing_change,
+ tegra_emc_complete_timing_change);
+
if (IS_ENABLED(CONFIG_DEBUG_FS))
emc_debugfs_init(&pdev->dev, emc);

+ /*
+ * Don't allow the kernel module to be unloaded. Unloading adds some
+ * extra complexity which doesn't really worth the effort in a case of
+ * this driver.
+ */
+ try_module_get(THIS_MODULE);
+
return 0;
};

@@ -1240,9 +1252,8 @@ static struct platform_driver tegra_emc_driver = {
.suppress_bind_attrs = true,
},
};
+module_platform_driver(tegra_emc_driver);

-static int tegra_emc_init(void)
-{
- return platform_driver_register(&tegra_emc_driver);
-}
-subsys_initcall(tegra_emc_init);
+MODULE_AUTHOR("Mikko Perttunen <[email protected]>");
+MODULE_DESCRIPTION("NVIDIA Tegra124 EMC driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/clk/tegra.h b/include/linux/clk/tegra.h
index 3f01d43f0598..eb016fc9cc0b 100644
--- a/include/linux/clk/tegra.h
+++ b/include/linux/clk/tegra.h
@@ -136,6 +136,7 @@ extern void tegra210_clk_emc_dll_update_setting(u32 emc_dll_src_value);
extern void tegra210_clk_emc_update_setting(u32 emc_src_value);

struct clk;
+struct tegra_emc;

typedef long (tegra20_clk_emc_round_cb)(unsigned long rate,
unsigned long min_rate,
@@ -146,6 +147,13 @@ void tegra20_clk_set_emc_round_callback(tegra20_clk_emc_round_cb *round_cb,
void *cb_arg);
int tegra20_clk_prepare_emc_mc_same_freq(struct clk *emc_clk, bool same);

+typedef int (tegra124_emc_prepare_timing_change_cb)(struct tegra_emc *emc,
+ unsigned long rate);
+typedef void (tegra124_emc_complete_timing_change_cb)(struct tegra_emc *emc,
+ unsigned long rate);
+void tegra124_clk_set_emc_callbacks(tegra124_emc_prepare_timing_change_cb *prep_cb,
+ tegra124_emc_complete_timing_change_cb *complete_cb);
+
struct tegra210_clk_emc_config {
unsigned long rate;
bool same_freq;
diff --git a/include/soc/tegra/emc.h b/include/soc/tegra/emc.h
deleted file mode 100644
index 05199a97ccf4..000000000000
--- a/include/soc/tegra/emc.h
+++ /dev/null
@@ -1,16 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Copyright (c) 2014 NVIDIA Corporation. All rights reserved.
- */
-
-#ifndef __SOC_TEGRA_EMC_H__
-#define __SOC_TEGRA_EMC_H__
-
-struct tegra_emc;
-
-int tegra_emc_prepare_timing_change(struct tegra_emc *emc,
- unsigned long rate);
-void tegra_emc_complete_timing_change(struct tegra_emc *emc,
- unsigned long rate);
-
-#endif /* __SOC_TEGRA_EMC_H__ */
--
2.29.2

2020-11-16 00:25:43

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v9 12/17] ARM: tegra: Add interconnect properties to Tegra30 device-tree

Add interconnect properties to the Memory Controller, External Memory
Controller and the Display Controller nodes in order to describe hardware
interconnection.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
arch/arm/boot/dts/tegra30.dtsi | 27 ++++++++++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/tegra30.dtsi b/arch/arm/boot/dts/tegra30.dtsi
index aeae8c092d41..2caf6cc6f4b1 100644
--- a/arch/arm/boot/dts/tegra30.dtsi
+++ b/arch/arm/boot/dts/tegra30.dtsi
@@ -210,6 +210,17 @@ dc@54200000 {

nvidia,head = <0>;

+ interconnects = <&mc TEGRA30_MC_DISPLAY0A &emc>,
+ <&mc TEGRA30_MC_DISPLAY0B &emc>,
+ <&mc TEGRA30_MC_DISPLAY1B &emc>,
+ <&mc TEGRA30_MC_DISPLAY0C &emc>,
+ <&mc TEGRA30_MC_DISPLAYHC &emc>;
+ interconnect-names = "wina",
+ "winb",
+ "winb-vfilter",
+ "winc",
+ "cursor";
+
rgb {
status = "disabled";
};
@@ -229,6 +240,17 @@ dc@54240000 {

nvidia,head = <1>;

+ interconnects = <&mc TEGRA30_MC_DISPLAY0AB &emc>,
+ <&mc TEGRA30_MC_DISPLAY0BB &emc>,
+ <&mc TEGRA30_MC_DISPLAY1BB &emc>,
+ <&mc TEGRA30_MC_DISPLAY0CB &emc>,
+ <&mc TEGRA30_MC_DISPLAYHCB &emc>;
+ interconnect-names = "wina",
+ "winb",
+ "winb-vfilter",
+ "winc",
+ "cursor";
+
rgb {
status = "disabled";
};
@@ -748,15 +770,18 @@ mc: memory-controller@7000f000 {

#iommu-cells = <1>;
#reset-cells = <1>;
+ #interconnect-cells = <1>;
};

- memory-controller@7000f400 {
+ emc: memory-controller@7000f400 {
compatible = "nvidia,tegra30-emc";
reg = <0x7000f400 0x400>;
interrupts = <GIC_SPI 78 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&tegra_car TEGRA30_CLK_EMC>;

nvidia,memory-controller = <&mc>;
+
+ #interconnect-cells = <0>;
};

fuse@7000f800 {
--
2.29.2

2020-11-16 00:26:52

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v9 10/17] ARM: tegra: Correct EMC registers size in Tegra20 device-tree

Fix the size of Tegra20 EMC registers, which should be twice bigger.

Acked-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
arch/arm/boot/dts/tegra20.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index 72a4211a618f..9347f7789245 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -634,7 +634,7 @@ mc: memory-controller@7000f000 {

memory-controller@7000f400 {
compatible = "nvidia,tegra20-emc";
- reg = <0x7000f400 0x200>;
+ reg = <0x7000f400 0x400>;
interrupts = <GIC_SPI 78 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&tegra_car TEGRA20_CLK_EMC>;
#address-cells = <1>;
--
2.29.2

2020-11-17 10:09:14

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH v9 07/17] PM / devfreq: tegra30: Support interconnect and OPPs from device-tree

On 16-11-20, 00:29, Dmitry Osipenko wrote:
> This patch moves ACTMON driver away from generating OPP table by itself,
> transitioning it to use the table which comes from device-tree. This
> change breaks compatibility with older device-trees in order to bring
> support for the interconnect framework to the driver. This is a mandatory
> change which needs to be done in order to implement interconnect-based
> memory DVFS. Users of legacy device-trees will get a message telling that
> theirs DT needs to be upgraded. Now ACTMON issues memory bandwidth request
> using dev_pm_opp_set_bw(), instead of driving EMC clock rate directly.
>
> Tested-by: Peter Geis <[email protected]>
> Tested-by: Nicolas Chauvet <[email protected]>
> Acked-by: Chanwoo Choi <[email protected]>
> Signed-off-by: Dmitry Osipenko <[email protected]>
> ---
> drivers/devfreq/tegra30-devfreq.c | 86 ++++++++++++++++---------------
> 1 file changed, 44 insertions(+), 42 deletions(-)
>
> diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c
> index 38cc0d014738..ed6d4469c8c7 100644
> --- a/drivers/devfreq/tegra30-devfreq.c
> +++ b/drivers/devfreq/tegra30-devfreq.c
> @@ -19,6 +19,8 @@
> #include <linux/reset.h>
> #include <linux/workqueue.h>
>
> +#include <soc/tegra/fuse.h>
> +
> #include "governor.h"
>
> #define ACTMON_GLB_STATUS 0x0
> @@ -155,6 +157,7 @@ struct tegra_devfreq_device {
>
> struct tegra_devfreq {
> struct devfreq *devfreq;
> + struct opp_table *opp_table;
>
> struct reset_control *reset;
> struct clk *clock;
> @@ -612,34 +615,19 @@ static void tegra_actmon_stop(struct tegra_devfreq *tegra)
> static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
> u32 flags)
> {
> - struct tegra_devfreq *tegra = dev_get_drvdata(dev);
> - struct devfreq *devfreq = tegra->devfreq;
> struct dev_pm_opp *opp;
> - unsigned long rate;
> - int err;
> + int ret;
>
> opp = devfreq_recommended_opp(dev, freq, flags);
> if (IS_ERR(opp)) {
> dev_err(dev, "Failed to find opp for %lu Hz\n", *freq);
> return PTR_ERR(opp);
> }
> - rate = dev_pm_opp_get_freq(opp);
> - dev_pm_opp_put(opp);
> -
> - err = clk_set_min_rate(tegra->emc_clock, rate * KHZ);
> - if (err)
> - return err;
> -
> - err = clk_set_rate(tegra->emc_clock, 0);
> - if (err)
> - goto restore_min_rate;
>
> - return 0;
> -
> -restore_min_rate:
> - clk_set_min_rate(tegra->emc_clock, devfreq->previous_freq);
> + ret = dev_pm_opp_set_bw(dev, opp);
> + dev_pm_opp_put(opp);
>
> - return err;
> + return ret;
> }
>
> static int tegra_devfreq_get_dev_status(struct device *dev,
> @@ -655,7 +643,7 @@ static int tegra_devfreq_get_dev_status(struct device *dev,
> stat->private_data = tegra;
>
> /* The below are to be used by the other governors */
> - stat->current_frequency = cur_freq;
> + stat->current_frequency = cur_freq * KHZ;
>
> actmon_dev = &tegra->devices[MCALL];
>
> @@ -705,7 +693,12 @@ static int tegra_governor_get_target(struct devfreq *devfreq,
> target_freq = max(target_freq, dev->target_freq);
> }
>
> - *freq = target_freq;
> + /*
> + * tegra-devfreq driver operates with KHz units, while OPP table
> + * entries use Hz units. Hence we need to convert the units for the
> + * devfreq core.
> + */
> + *freq = target_freq * KHZ;
>
> return 0;
> }
> @@ -774,6 +767,7 @@ static struct devfreq_governor tegra_devfreq_governor = {
>
> static int tegra_devfreq_probe(struct platform_device *pdev)
> {
> + u32 hw_version = BIT(tegra_sku_info.soc_speedo_id);
> struct tegra_devfreq_device *dev;
> struct tegra_devfreq *tegra;
> struct devfreq *devfreq;
> @@ -781,6 +775,13 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
> long rate;
> int err;
>
> + /* legacy device-trees don't have OPP table and must be updated */
> + if (!device_property_present(&pdev->dev, "operating-points-v2")) {
> + dev_err(&pdev->dev,
> + "OPP table not found, please update your device tree\n");
> + return -ENODEV;
> + }
> +

You forgot to remove this ?

> tegra = devm_kzalloc(&pdev->dev, sizeof(*tegra), GFP_KERNEL);
> if (!tegra)
> return -ENOMEM;
> @@ -822,11 +823,25 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
> return err;
> }
>
> + tegra->opp_table = dev_pm_opp_set_supported_hw(&pdev->dev,
> + &hw_version, 1);
> + err = PTR_ERR_OR_ZERO(tegra->opp_table);
> + if (err) {
> + dev_err(&pdev->dev, "Failed to set supported HW: %d\n", err);
> + return err;
> + }
> +
> + err = dev_pm_opp_of_add_table(&pdev->dev);
> + if (err) {
> + dev_err(&pdev->dev, "Failed to add OPP table: %d\n", err);
> + goto put_hw;
> + }
> +
> err = clk_prepare_enable(tegra->clock);
> if (err) {
> dev_err(&pdev->dev,
> "Failed to prepare and enable ACTMON clock\n");
> - return err;
> + goto remove_table;
> }
>
> err = reset_control_reset(tegra->reset);
> @@ -850,23 +865,6 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
> dev->regs = tegra->regs + dev->config->offset;
> }
>
> - for (rate = 0; rate <= tegra->max_freq * KHZ; rate++) {
> - rate = clk_round_rate(tegra->emc_clock, rate);
> -
> - if (rate < 0) {
> - dev_err(&pdev->dev,
> - "Failed to round clock rate: %ld\n", rate);
> - err = rate;
> - goto remove_opps;
> - }
> -
> - err = dev_pm_opp_add(&pdev->dev, rate / KHZ, 0);
> - if (err) {
> - dev_err(&pdev->dev, "Failed to add OPP: %d\n", err);
> - goto remove_opps;
> - }
> - }
> -
> platform_set_drvdata(pdev, tegra);
>
> tegra->clk_rate_change_nb.notifier_call = tegra_actmon_clk_notify_cb;
> @@ -882,7 +880,6 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
> }
>
> tegra_devfreq_profile.initial_freq = clk_get_rate(tegra->emc_clock);
> - tegra_devfreq_profile.initial_freq /= KHZ;
>
> devfreq = devfreq_add_device(&pdev->dev, &tegra_devfreq_profile,
> "tegra_actmon", NULL);
> @@ -902,6 +899,10 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
> reset_control_reset(tegra->reset);
> disable_clk:
> clk_disable_unprepare(tegra->clock);
> +remove_table:
> + dev_pm_opp_of_remove_table(&pdev->dev);
> +put_hw:
> + dev_pm_opp_put_supported_hw(tegra->opp_table);
>
> return err;
> }
> @@ -913,11 +914,12 @@ static int tegra_devfreq_remove(struct platform_device *pdev)
> devfreq_remove_device(tegra->devfreq);
> devfreq_remove_governor(&tegra_devfreq_governor);
>
> - dev_pm_opp_remove_all_dynamic(&pdev->dev);
> -
> reset_control_reset(tegra->reset);
> clk_disable_unprepare(tegra->clock);
>
> + dev_pm_opp_of_remove_table(&pdev->dev);
> + dev_pm_opp_put_supported_hw(tegra->opp_table);
> +
> return 0;
> }
>
> --
> 2.29.2

--
viresh

2020-11-17 14:22:46

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v9 07/17] PM / devfreq: tegra30: Support interconnect and OPPs from device-tree

17.11.2020 13:07, Viresh Kumar пишет:
> On 16-11-20, 00:29, Dmitry Osipenko wrote:
>> This patch moves ACTMON driver away from generating OPP table by itself,
>> transitioning it to use the table which comes from device-tree. This
>> change breaks compatibility with older device-trees in order to bring
>> support for the interconnect framework to the driver. This is a mandatory
>> change which needs to be done in order to implement interconnect-based
>> memory DVFS. Users of legacy device-trees will get a message telling that
>> theirs DT needs to be upgraded. Now ACTMON issues memory bandwidth request
>> using dev_pm_opp_set_bw(), instead of driving EMC clock rate directly.
>>
>> Tested-by: Peter Geis <[email protected]>
>> Tested-by: Nicolas Chauvet <[email protected]>
>> Acked-by: Chanwoo Choi <[email protected]>
>> Signed-off-by: Dmitry Osipenko <[email protected]>
>> ---
>> drivers/devfreq/tegra30-devfreq.c | 86 ++++++++++++++++---------------
>> 1 file changed, 44 insertions(+), 42 deletions(-)
>>
>> diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c
>> index 38cc0d014738..ed6d4469c8c7 100644
>> --- a/drivers/devfreq/tegra30-devfreq.c
>> +++ b/drivers/devfreq/tegra30-devfreq.c
>> @@ -19,6 +19,8 @@
>> #include <linux/reset.h>
>> #include <linux/workqueue.h>
>>
>> +#include <soc/tegra/fuse.h>
>> +
>> #include "governor.h"
>>
>> #define ACTMON_GLB_STATUS 0x0
>> @@ -155,6 +157,7 @@ struct tegra_devfreq_device {
>>
>> struct tegra_devfreq {
>> struct devfreq *devfreq;
>> + struct opp_table *opp_table;
>>
>> struct reset_control *reset;
>> struct clk *clock;
>> @@ -612,34 +615,19 @@ static void tegra_actmon_stop(struct tegra_devfreq *tegra)
>> static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
>> u32 flags)
>> {
>> - struct tegra_devfreq *tegra = dev_get_drvdata(dev);
>> - struct devfreq *devfreq = tegra->devfreq;
>> struct dev_pm_opp *opp;
>> - unsigned long rate;
>> - int err;
>> + int ret;
>>
>> opp = devfreq_recommended_opp(dev, freq, flags);
>> if (IS_ERR(opp)) {
>> dev_err(dev, "Failed to find opp for %lu Hz\n", *freq);
>> return PTR_ERR(opp);
>> }
>> - rate = dev_pm_opp_get_freq(opp);
>> - dev_pm_opp_put(opp);
>> -
>> - err = clk_set_min_rate(tegra->emc_clock, rate * KHZ);
>> - if (err)
>> - return err;
>> -
>> - err = clk_set_rate(tegra->emc_clock, 0);
>> - if (err)
>> - goto restore_min_rate;
>>
>> - return 0;
>> -
>> -restore_min_rate:
>> - clk_set_min_rate(tegra->emc_clock, devfreq->previous_freq);
>> + ret = dev_pm_opp_set_bw(dev, opp);
>> + dev_pm_opp_put(opp);
>>
>> - return err;
>> + return ret;
>> }
>>
>> static int tegra_devfreq_get_dev_status(struct device *dev,
>> @@ -655,7 +643,7 @@ static int tegra_devfreq_get_dev_status(struct device *dev,
>> stat->private_data = tegra;
>>
>> /* The below are to be used by the other governors */
>> - stat->current_frequency = cur_freq;
>> + stat->current_frequency = cur_freq * KHZ;
>>
>> actmon_dev = &tegra->devices[MCALL];
>>
>> @@ -705,7 +693,12 @@ static int tegra_governor_get_target(struct devfreq *devfreq,
>> target_freq = max(target_freq, dev->target_freq);
>> }
>>
>> - *freq = target_freq;
>> + /*
>> + * tegra-devfreq driver operates with KHz units, while OPP table
>> + * entries use Hz units. Hence we need to convert the units for the
>> + * devfreq core.
>> + */
>> + *freq = target_freq * KHZ;
>>
>> return 0;
>> }
>> @@ -774,6 +767,7 @@ static struct devfreq_governor tegra_devfreq_governor = {
>>
>> static int tegra_devfreq_probe(struct platform_device *pdev)
>> {
>> + u32 hw_version = BIT(tegra_sku_info.soc_speedo_id);
>> struct tegra_devfreq_device *dev;
>> struct tegra_devfreq *tegra;
>> struct devfreq *devfreq;
>> @@ -781,6 +775,13 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
>> long rate;
>> int err;
>>
>> + /* legacy device-trees don't have OPP table and must be updated */
>> + if (!device_property_present(&pdev->dev, "operating-points-v2")) {
>> + dev_err(&pdev->dev,
>> + "OPP table not found, please update your device tree\n");
>> + return -ENODEV;
>> + }
>> +
>
> You forgot to remove this ?

Yes, good catch. I'm planning to replace this code with a common helper
sometime soon, so if there won't be another reasons to make a new
revision, then I'd prefer to keep it as-is for now.

2020-11-17 20:27:07

by Georgi Djakov

[permalink] [raw]
Subject: Re: [PATCH v9 01/17] memory: tegra30: Support interconnect framework

Hi Dmitry,

Thank you working on this!

On 15.11.20 23:29, Dmitry Osipenko wrote:
> Now Internal and External memory controllers are memory interconnection
> providers. This allows us to use interconnect API for tuning of memory
> configuration. EMC driver now supports OPPs and DVFS. MC driver now
> supports tuning of memory arbitration latency, which needs to be done
> for ISO memory clients, like a Display client for example.
>
> Tested-by: Peter Geis <[email protected]>
> Signed-off-by: Dmitry Osipenko <[email protected]>
> ---
> drivers/memory/tegra/Kconfig | 1 +
> drivers/memory/tegra/tegra30-emc.c | 349 +++++++++++++++++++++++++++--
> drivers/memory/tegra/tegra30.c | 173 +++++++++++++-
> 3 files changed, 501 insertions(+), 22 deletions(-)
>
[..]> diff --git a/drivers/memory/tegra/tegra30.c
b/drivers/memory/tegra/tegra30.c
> index d0314f29608d..ea849003014b 100644
> --- a/drivers/memory/tegra/tegra30.c
> +++ b/drivers/memory/tegra/tegra30.c
[..]
> +
> +static int tegra30_mc_icc_set(struct icc_node *src, struct icc_node *dst)
> +{
> + struct tegra_mc *mc = icc_provider_to_tegra_mc(src->provider);
> + const struct tegra_mc_client *client = &mc->soc->clients[src->id];
> + u64 peak_bandwidth = icc_units_to_bps(src->peak_bw);
> +
> + /*
> + * Skip pre-initialization that is done by icc_node_add(), which sets
> + * bandwidth to maximum for all clients before drivers are loaded.
> + *
> + * This doesn't make sense for us because we don't have drivers for all
> + * clients and it's okay to keep configuration left from bootloader
> + * during boot, at least for today.
> + */
> + if (src == dst)
> + return 0;

Nit: The "proper" way to express this should be to implement the
.get_bw() callback to return zero as initial average/peak bandwidth.
I'm wondering if this will work here?

The rest looks good to me!

Thanks,
Georgi

2020-11-17 22:04:23

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v9 01/17] memory: tegra30: Support interconnect framework

17.11.2020 23:24, Georgi Djakov пишет:
> Hi Dmitry,
>
> Thank you working on this!
>
> On 15.11.20 23:29, Dmitry Osipenko wrote:
>> Now Internal and External memory controllers are memory interconnection
>> providers. This allows us to use interconnect API for tuning of memory
>> configuration. EMC driver now supports OPPs and DVFS. MC driver now
>> supports tuning of memory arbitration latency, which needs to be done
>> for ISO memory clients, like a Display client for example.
>>
>> Tested-by: Peter Geis <[email protected]>
>> Signed-off-by: Dmitry Osipenko <[email protected]>
>> ---
>>   drivers/memory/tegra/Kconfig       |   1 +
>>   drivers/memory/tegra/tegra30-emc.c | 349 +++++++++++++++++++++++++++--
>>   drivers/memory/tegra/tegra30.c     | 173 +++++++++++++-
>>   3 files changed, 501 insertions(+), 22 deletions(-)
>>
> [..]> diff --git a/drivers/memory/tegra/tegra30.c
> b/drivers/memory/tegra/tegra30.c
>> index d0314f29608d..ea849003014b 100644
>> --- a/drivers/memory/tegra/tegra30.c
>> +++ b/drivers/memory/tegra/tegra30.c
> [..]
>> +
>> +static int tegra30_mc_icc_set(struct icc_node *src, struct icc_node
>> *dst)
>> +{
>> +    struct tegra_mc *mc = icc_provider_to_tegra_mc(src->provider);
>> +    const struct tegra_mc_client *client = &mc->soc->clients[src->id];
>> +    u64 peak_bandwidth = icc_units_to_bps(src->peak_bw);
>> +
>> +    /*
>> +     * Skip pre-initialization that is done by icc_node_add(), which
>> sets
>> +     * bandwidth to maximum for all clients before drivers are loaded.
>> +     *
>> +     * This doesn't make sense for us because we don't have drivers
>> for all
>> +     * clients and it's okay to keep configuration left from bootloader
>> +     * during boot, at least for today.
>> +     */
>> +    if (src == dst)
>> +        return 0;
>
> Nit: The "proper" way to express this should be to implement the
> .get_bw() callback to return zero as initial average/peak bandwidth.
> I'm wondering if this will work here?
>
> The rest looks good to me!

Hello Georgi,

Returning zeros doesn't allow us to skip the initialization that is done
by provider->set(node, node) in icc_node_add(). It will reconfigure
memory latency in accordance to a zero memory bandwidth, which is wrong
to do.

It actually should be more preferred to preset bandwidth to a maximum
before all drivers are synced, but this should be done only once we will
wire up all drivers to use ICC framework. For now it's safer to keep the
default hardware configuration untouched.

2020-11-18 04:25:22

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH v9 07/17] PM / devfreq: tegra30: Support interconnect and OPPs from device-tree

On 17-11-20, 17:17, Dmitry Osipenko wrote:
> 17.11.2020 13:07, Viresh Kumar пишет:
> > On 16-11-20, 00:29, Dmitry Osipenko wrote:
> >> This patch moves ACTMON driver away from generating OPP table by itself,
> >> transitioning it to use the table which comes from device-tree. This
> >> change breaks compatibility with older device-trees in order to bring
> >> support for the interconnect framework to the driver. This is a mandatory
> >> change which needs to be done in order to implement interconnect-based
> >> memory DVFS. Users of legacy device-trees will get a message telling that
> >> theirs DT needs to be upgraded. Now ACTMON issues memory bandwidth request
> >> using dev_pm_opp_set_bw(), instead of driving EMC clock rate directly.
> >>
> >> Tested-by: Peter Geis <[email protected]>
> >> Tested-by: Nicolas Chauvet <[email protected]>
> >> Acked-by: Chanwoo Choi <[email protected]>
> >> Signed-off-by: Dmitry Osipenko <[email protected]>
> >> ---
> >> drivers/devfreq/tegra30-devfreq.c | 86 ++++++++++++++++---------------
> >> 1 file changed, 44 insertions(+), 42 deletions(-)
> >>
> >> diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c
> >> index 38cc0d014738..ed6d4469c8c7 100644
> >> --- a/drivers/devfreq/tegra30-devfreq.c
> >> +++ b/drivers/devfreq/tegra30-devfreq.c
> >> @@ -19,6 +19,8 @@
> >> #include <linux/reset.h>
> >> #include <linux/workqueue.h>
> >>
> >> +#include <soc/tegra/fuse.h>
> >> +
> >> #include "governor.h"
> >>
> >> #define ACTMON_GLB_STATUS 0x0
> >> @@ -155,6 +157,7 @@ struct tegra_devfreq_device {
> >>
> >> struct tegra_devfreq {
> >> struct devfreq *devfreq;
> >> + struct opp_table *opp_table;
> >>
> >> struct reset_control *reset;
> >> struct clk *clock;
> >> @@ -612,34 +615,19 @@ static void tegra_actmon_stop(struct tegra_devfreq *tegra)
> >> static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
> >> u32 flags)
> >> {
> >> - struct tegra_devfreq *tegra = dev_get_drvdata(dev);
> >> - struct devfreq *devfreq = tegra->devfreq;
> >> struct dev_pm_opp *opp;
> >> - unsigned long rate;
> >> - int err;
> >> + int ret;
> >>
> >> opp = devfreq_recommended_opp(dev, freq, flags);
> >> if (IS_ERR(opp)) {
> >> dev_err(dev, "Failed to find opp for %lu Hz\n", *freq);
> >> return PTR_ERR(opp);
> >> }
> >> - rate = dev_pm_opp_get_freq(opp);
> >> - dev_pm_opp_put(opp);
> >> -
> >> - err = clk_set_min_rate(tegra->emc_clock, rate * KHZ);
> >> - if (err)
> >> - return err;
> >> -
> >> - err = clk_set_rate(tegra->emc_clock, 0);
> >> - if (err)
> >> - goto restore_min_rate;
> >>
> >> - return 0;
> >> -
> >> -restore_min_rate:
> >> - clk_set_min_rate(tegra->emc_clock, devfreq->previous_freq);
> >> + ret = dev_pm_opp_set_bw(dev, opp);
> >> + dev_pm_opp_put(opp);
> >>
> >> - return err;
> >> + return ret;
> >> }
> >>
> >> static int tegra_devfreq_get_dev_status(struct device *dev,
> >> @@ -655,7 +643,7 @@ static int tegra_devfreq_get_dev_status(struct device *dev,
> >> stat->private_data = tegra;
> >>
> >> /* The below are to be used by the other governors */
> >> - stat->current_frequency = cur_freq;
> >> + stat->current_frequency = cur_freq * KHZ;
> >>
> >> actmon_dev = &tegra->devices[MCALL];
> >>
> >> @@ -705,7 +693,12 @@ static int tegra_governor_get_target(struct devfreq *devfreq,
> >> target_freq = max(target_freq, dev->target_freq);
> >> }
> >>
> >> - *freq = target_freq;
> >> + /*
> >> + * tegra-devfreq driver operates with KHz units, while OPP table
> >> + * entries use Hz units. Hence we need to convert the units for the
> >> + * devfreq core.
> >> + */
> >> + *freq = target_freq * KHZ;
> >>
> >> return 0;
> >> }
> >> @@ -774,6 +767,7 @@ static struct devfreq_governor tegra_devfreq_governor = {
> >>
> >> static int tegra_devfreq_probe(struct platform_device *pdev)
> >> {
> >> + u32 hw_version = BIT(tegra_sku_info.soc_speedo_id);
> >> struct tegra_devfreq_device *dev;
> >> struct tegra_devfreq *tegra;
> >> struct devfreq *devfreq;
> >> @@ -781,6 +775,13 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
> >> long rate;
> >> int err;
> >>
> >> + /* legacy device-trees don't have OPP table and must be updated */
> >> + if (!device_property_present(&pdev->dev, "operating-points-v2")) {
> >> + dev_err(&pdev->dev,
> >> + "OPP table not found, please update your device tree\n");
> >> + return -ENODEV;
> >> + }
> >> +
> >
> > You forgot to remove this ?
>
> Yes, good catch. I'm planning to replace this code with a common helper
> sometime soon, so if there won't be another reasons to make a new
> revision, then I'd prefer to keep it as-is for now.

You should just replace this patch only with a version of V9.1 and you
aren't really required to resend the whole series. And you should fix
it before it gets merged. This isn't a formatting issue which we just
let through. I trust you when you say that you will fix it, but this
must be fixed now.

--
viresh

2020-11-18 15:34:44

by Georgi Djakov

[permalink] [raw]
Subject: Re: [PATCH v9 01/17] memory: tegra30: Support interconnect framework

On 18.11.20 0:02, Dmitry Osipenko wrote:
> 17.11.2020 23:24, Georgi Djakov пишет:
>> Hi Dmitry,
>>
>> Thank you working on this!
>>
>> On 15.11.20 23:29, Dmitry Osipenko wrote:
>>> Now Internal and External memory controllers are memory interconnection
>>> providers. This allows us to use interconnect API for tuning of memory
>>> configuration. EMC driver now supports OPPs and DVFS. MC driver now
>>> supports tuning of memory arbitration latency, which needs to be done
>>> for ISO memory clients, like a Display client for example.
>>>
>>> Tested-by: Peter Geis <[email protected]>
>>> Signed-off-by: Dmitry Osipenko <[email protected]>
>>> ---
>>>   drivers/memory/tegra/Kconfig       |   1 +
>>>   drivers/memory/tegra/tegra30-emc.c | 349 +++++++++++++++++++++++++++--
>>>   drivers/memory/tegra/tegra30.c     | 173 +++++++++++++-
>>>   3 files changed, 501 insertions(+), 22 deletions(-)
>>>
>> [..]> diff --git a/drivers/memory/tegra/tegra30.c
>> b/drivers/memory/tegra/tegra30.c
>>> index d0314f29608d..ea849003014b 100644
>>> --- a/drivers/memory/tegra/tegra30.c
>>> +++ b/drivers/memory/tegra/tegra30.c
>> [..]
>>> +
>>> +static int tegra30_mc_icc_set(struct icc_node *src, struct icc_node
>>> *dst)
>>> +{
>>> +    struct tegra_mc *mc = icc_provider_to_tegra_mc(src->provider);
>>> +    const struct tegra_mc_client *client = &mc->soc->clients[src->id];
>>> +    u64 peak_bandwidth = icc_units_to_bps(src->peak_bw);
>>> +
>>> +    /*
>>> +     * Skip pre-initialization that is done by icc_node_add(), which
>>> sets
>>> +     * bandwidth to maximum for all clients before drivers are loaded.
>>> +     *
>>> +     * This doesn't make sense for us because we don't have drivers
>>> for all
>>> +     * clients and it's okay to keep configuration left from bootloader
>>> +     * during boot, at least for today.
>>> +     */
>>> +    if (src == dst)
>>> +        return 0;
>>
>> Nit: The "proper" way to express this should be to implement the
>> .get_bw() callback to return zero as initial average/peak bandwidth.
>> I'm wondering if this will work here?
>>
>> The rest looks good to me!
>
> Hello Georgi,
>
> Returning zeros doesn't allow us to skip the initialization that is done
> by provider->set(node, node) in icc_node_add(). It will reconfigure
> memory latency in accordance to a zero memory bandwidth, which is wrong
> to do.
>
> It actually should be more preferred to preset bandwidth to a maximum
> before all drivers are synced, but this should be done only once we will
> wire up all drivers to use ICC framework. For now it's safer to keep the
> default hardware configuration untouched.

Ok, thanks for clarifying! Is there a way to read this hardware
configuration and convert it to initial bandwidth? That's the
idea of the get_bw() callback actually. I am just curious and
trying to get a better understanding how this works and if it
would be useful for Tegra.

Thanks,
Georgi

2020-11-19 12:10:55

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v9 01/17] memory: tegra30: Support interconnect framework

18.11.2020 18:30, Georgi Djakov пишет:
> On 18.11.20 0:02, Dmitry Osipenko wrote:
>> 17.11.2020 23:24, Georgi Djakov пишет:
>>> Hi Dmitry,
>>>
>>> Thank you working on this!
>>>
>>> On 15.11.20 23:29, Dmitry Osipenko wrote:
>>>> Now Internal and External memory controllers are memory interconnection
>>>> providers. This allows us to use interconnect API for tuning of memory
>>>> configuration. EMC driver now supports OPPs and DVFS. MC driver now
>>>> supports tuning of memory arbitration latency, which needs to be done
>>>> for ISO memory clients, like a Display client for example.
>>>>
>>>> Tested-by: Peter Geis <[email protected]>
>>>> Signed-off-by: Dmitry Osipenko <[email protected]>
>>>> ---
>>>>    drivers/memory/tegra/Kconfig       |   1 +
>>>>    drivers/memory/tegra/tegra30-emc.c | 349
>>>> +++++++++++++++++++++++++++--
>>>>    drivers/memory/tegra/tegra30.c     | 173 +++++++++++++-
>>>>    3 files changed, 501 insertions(+), 22 deletions(-)
>>>>
>>> [..]> diff --git a/drivers/memory/tegra/tegra30.c
>>> b/drivers/memory/tegra/tegra30.c
>>>> index d0314f29608d..ea849003014b 100644
>>>> --- a/drivers/memory/tegra/tegra30.c
>>>> +++ b/drivers/memory/tegra/tegra30.c
>>> [..]
>>>> +
>>>> +static int tegra30_mc_icc_set(struct icc_node *src, struct icc_node
>>>> *dst)
>>>> +{
>>>> +    struct tegra_mc *mc = icc_provider_to_tegra_mc(src->provider);
>>>> +    const struct tegra_mc_client *client = &mc->soc->clients[src->id];
>>>> +    u64 peak_bandwidth = icc_units_to_bps(src->peak_bw);
>>>> +
>>>> +    /*
>>>> +     * Skip pre-initialization that is done by icc_node_add(), which
>>>> sets
>>>> +     * bandwidth to maximum for all clients before drivers are loaded.
>>>> +     *
>>>> +     * This doesn't make sense for us because we don't have drivers
>>>> for all
>>>> +     * clients and it's okay to keep configuration left from
>>>> bootloader
>>>> +     * during boot, at least for today.
>>>> +     */
>>>> +    if (src == dst)
>>>> +        return 0;
>>>
>>> Nit: The "proper" way to express this should be to implement the
>>> .get_bw() callback to return zero as initial average/peak bandwidth.
>>> I'm wondering if this will work here?
>>>
>>> The rest looks good to me!
>>
>> Hello Georgi,
>>
>> Returning zeros doesn't allow us to skip the initialization that is done
>> by provider->set(node, node) in icc_node_add(). It will reconfigure
>> memory latency in accordance to a zero memory bandwidth, which is wrong
>> to do.
>>
>> It actually should be more preferred to preset bandwidth to a maximum
>> before all drivers are synced, but this should be done only once we will
>> wire up all drivers to use ICC framework. For now it's safer to keep the
>> default hardware configuration untouched.
>
> Ok, thanks for clarifying! Is there a way to read this hardware
> configuration and convert it to initial bandwidth? That's the
> idea of the get_bw() callback actually. I am just curious and
> trying to get a better understanding how this works and if it
> would be useful for Tegra.

MC driver can't easily retrieve and convert initial bandwidths because
they depend on knowing hardware state that is not accessible by the MC
driver.

But in fact it's unnecessary to know the initial bandwidth in the case
of this MC ICC driver because if configuration is re-set to the same
value, then this is equal to leaving configuration unchanged.

It's okay to keep memory latency configuration unchanged if memory clock
rate goes up, which is what happens here during init. Please notice that
EMC ICC drivers (which control the clock rate) don't skip the initial
bandwidth change.

2020-11-19 21:06:38

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v9 07/17] PM / devfreq: tegra30: Support interconnect and OPPs from device-tree

18.11.2020 07:21, Viresh Kumar пишет:
...
>>>> + /* legacy device-trees don't have OPP table and must be updated */
>>>> + if (!device_property_present(&pdev->dev, "operating-points-v2")) {
>>>> + dev_err(&pdev->dev,
>>>> + "OPP table not found, please update your device tree\n");
>>>> + return -ENODEV;
>>>> + }
>>>> +
>>>
>>> You forgot to remove this ?
>>
>> Yes, good catch. I'm planning to replace this code with a common helper
>> sometime soon, so if there won't be another reasons to make a new
>> revision, then I'd prefer to keep it as-is for now.
>
> You should just replace this patch only with a version of V9.1 and you
> aren't really required to resend the whole series. And you should fix
> it before it gets merged. This isn't a formatting issue which we just
> let through. I trust you when you say that you will fix it, but this
> must be fixed now.
>

Should be easier to resend it all. I'll update it over the weekend, thanks.