2022-06-01 07:08:52

by Abhinav Kumar

[permalink] [raw]
Subject: Re: [PATCH] drm/msm/dpu: Move min BW request and full BW disable back to mdss

Hi Doug

On 5/31/2022 7:28 AM, Douglas Anderson wrote:
> In commit a670ff578f1f ("drm/msm/dpu: always use mdp device to scale
> bandwidth") we fully moved interconnect stuff to the DPU driver. This
> had no change for sc7180 but _did_ have an impact for other SoCs. It
> made them match the sc7180 scheme.
>
> Unfortunately, the sc7180 scheme seems like it was a bit broken.
> Specifically the interconnect needs to be on for more than just the
> DPU driver's AXI bus. In the very least it also needs to be on for the
> DSI driver's AXI bus. This can be seen fairly easily by doing this on
> a ChromeOS sc7180-trogdor class device:
>
> set_power_policy --ac_screen_dim_delay=5 --ac_screen_off_delay=10
> sleep 10
> cd /sys/bus/platform/devices/ae94000.dsi/power
> echo on > control
>
> When you do that, you'll get a warning splat in the logs about
> "gcc_disp_hf_axi_clk status stuck at 'off'".
>
> One could argue that perhaps what I have done above is "illegal" and
> that it can't happen naturally in the system because in normal system
> usage the DPU is pretty much always on when DSI is on. That being
> said:
> * In official ChromeOS builds (admittedly a 5.4 kernel with backports)
> we have seen that splat at bootup.
> * Even though we don't use "autosuspend" for these components, we
> don't use the "put_sync" variants. Thus plausibly the DSI could stay
> "runtime enabled" past when the DPU is enabled. Techncially we
> shouldn't do that if the DPU's suspend ends up yanking our clock.
>
> Let's change things such that the "bare minimum" request for the
> interconnect happens in the mdss driver again. That means that all of
> the children can assume that the interconnect is on at the minimum
> bandwidth. We'll then let the DPU request the higher amount that it
> wants.
>
> It should be noted that this isn't as hacky of a solution as it might
> initially appear. Specifically:
> * Since MDSS and DPU individually get their own references to the
> interconnect then the framework will actually handle aggregating
> them. The two drivers are _not_ clobbering each other.
> * When the Qualcomm interconnect driver aggregates it takes the max of
> all the peaks. Thus having MDSS request a peak, as we're doing here,
> won't actually change the total interconnect bandwidth (it won't be
> added to the request for the DPU). This perhaps explains why the
> "average" requested in MDSS was historically 0 since that one
> _would_ be added in.
>
> NOTE also that in the downstream ChromeOS 5.4 and 5.15 kernels, we're
> also seeing some RPMH hangs that are addressed by this fix. These
> hangs are showing up in the field and on _some_ devices with enough
> stress testing of suspend/resume. Specifically right at suspend time
> with a stack crawl that looks like this (from chromeos-5.15 tree):
> rpmh_write_batch+0x19c/0x240
> qcom_icc_bcm_voter_commit+0x210/0x420
> qcom_icc_set+0x28/0x38
> apply_constraints+0x70/0xa4
> icc_set_bw+0x150/0x24c
> dpu_runtime_resume+0x50/0x1c4
> pm_generic_runtime_resume+0x30/0x44
> __genpd_runtime_resume+0x68/0x7c
> genpd_runtime_resume+0x12c/0x20c
> __rpm_callback+0x98/0x138
> rpm_callback+0x30/0x88
> rpm_resume+0x370/0x4a0
> __pm_runtime_resume+0x80/0xb0
> dpu_kms_enable_commit+0x24/0x30
> msm_atomic_commit_tail+0x12c/0x630
> commit_tail+0xac/0x150
> drm_atomic_helper_commit+0x114/0x11c
> drm_atomic_commit+0x68/0x78
> drm_atomic_helper_disable_all+0x158/0x1c8
> drm_atomic_helper_suspend+0xc0/0x1c0
> drm_mode_config_helper_suspend+0x2c/0x60
> msm_pm_prepare+0x2c/0x40
> pm_generic_prepare+0x30/0x44
> genpd_prepare+0x80/0xd0
> device_prepare+0x78/0x17c
> dpm_prepare+0xb0/0x384
> dpm_suspend_start+0x34/0xc0
>
> We don't completely understand all the mechanisms in play, but the
> hang seemed to come and go with random factors. It's not terribly
> surprising that the hang is gone after this patch since the line of
> code that was failing is no longer present in the kernel.
>
> Fixes: a670ff578f1f ("drm/msm/dpu: always use mdp device to scale bandwidth")
> Fixes: c33b7c0389e1 ("drm/msm/dpu: add support for clk and bw scaling for display")
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
>
> drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 8 ----
> drivers/gpu/drm/msm/msm_mdss.c | 58 +++++++++++++++++++++++++
> 2 files changed, 58 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> index 2b9d931474e0..3025184053e0 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> @@ -49,8 +49,6 @@
> #define DPU_DEBUGFS_DIR "msm_dpu"
> #define DPU_DEBUGFS_HWMASKNAME "hw_log_mask"
>
> -#define MIN_IB_BW 400000000ULL /* Min ib vote 400MB */
> -
> static int dpu_kms_hw_init(struct msm_kms *kms);
> static void _dpu_kms_mmu_destroy(struct dpu_kms *dpu_kms);
>
> @@ -1303,15 +1301,9 @@ static int __maybe_unused dpu_runtime_resume(struct device *dev)
> struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
> struct drm_encoder *encoder;
> struct drm_device *ddev;
> - int i;
>
> ddev = dpu_kms->dev;
>
> - WARN_ON(!(dpu_kms->num_paths));
> - /* Min vote of BW is required before turning on AXI clk */
> - for (i = 0; i < dpu_kms->num_paths; i++)
> - icc_set_bw(dpu_kms->path[i], 0, Bps_to_icc(MIN_IB_BW));
> -
> rc = clk_bulk_prepare_enable(dpu_kms->num_clocks, dpu_kms->clocks);
> if (rc) {
> DPU_ERROR("clock enable failed rc:%d\n", rc);
> diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c
> index 0454a571adf7..fe2c398d99b7 100644
> --- a/drivers/gpu/drm/msm/msm_mdss.c
> +++ b/drivers/gpu/drm/msm/msm_mdss.c
> @@ -5,6 +5,7 @@
>
> #include <linux/clk.h>
> #include <linux/delay.h>
> +#include <linux/interconnect.h>
> #include <linux/irq.h>
> #include <linux/irqchip.h>
> #include <linux/irqdesc.h>
> @@ -25,6 +26,8 @@
> #define UBWC_CTRL_2 0x150
> #define UBWC_PREDICTION_MODE 0x154
>
> +#define MIN_IB_BW 400000000UL /* Min ib vote 400MB */
> +
> struct msm_mdss {
> struct device *dev;
>
> @@ -36,8 +39,47 @@ struct msm_mdss {
> unsigned long enabled_mask;
> struct irq_domain *domain;
> } irq_controller;
> + struct icc_path *path[2];
> + u32 num_paths;
> };
>
> +static int msm_mdss_parse_data_bus_icc_path(struct device *dev,
> + struct msm_mdss *msm_mdss)
> +{
> + struct icc_path *path0 = of_icc_get(dev, "mdp0-mem");
> + struct icc_path *path1 = of_icc_get(dev, "mdp1-mem");
> +
> + if (IS_ERR_OR_NULL(path0))
> + return PTR_ERR_OR_ZERO(path0);
> +
> + msm_mdss->path[0] = path0;
> + msm_mdss->num_paths = 1;
> +
> + if (!IS_ERR_OR_NULL(path1)) {
> + msm_mdss->path[1] = path1;
> + msm_mdss->num_paths++;
> + }
> +
> + return 0;
> +}
> +
> +static void msm_mdss_put_icc_path(void *data)
> +{
> + struct msm_mdss *msm_mdss = data;
> + int i;
> +
> + for (i = 0; i < msm_mdss->num_paths; i++)
> + icc_put(msm_mdss->path[i]);
> +}
> +
> +static void msm_mdss_icc_request_bw(struct msm_mdss *msm_mdss, unsigned long bw)
> +{
> + int i;
> +
> + for (i = 0; i < msm_mdss->num_paths; i++)
> + icc_set_bw(msm_mdss->path[i], 0, Bps_to_icc(bw));
> +}
> +
> static void msm_mdss_irq(struct irq_desc *desc)
> {
> struct msm_mdss *msm_mdss = irq_desc_get_handler_data(desc);
> @@ -136,6 +178,13 @@ static int msm_mdss_enable(struct msm_mdss *msm_mdss)
> {
> int ret;
>
> + /*
> + * Several components have AXI clocks that can only be turned on if
> + * the interconnect is enabled (non-zero bandwidth). Let's make sure
> + * that the interconnects are at least at a minimum amount.
> + */
> + msm_mdss_icc_request_bw(msm_mdss, MIN_IB_BW);
> +

This patch does two things :

1) Move the min icc vote from the dpu_runtime_resume() path to the
msm_mdss_enable() so that while resuming, we add the min vote to the
parent device. We do need a min vote before turning on the AXI clk as
per this comment mentioned in c33b7c0389e1 (drm/msm/dpu: add support for
clk and bw scaling for display)


@@ -1101,8 +1129,15 @@ static int __maybe_unused
dpu_runtime_resume(struct device *dev)
struct drm_encoder *encoder;
struct drm_device *ddev;
struct dss_module_power *mp = &dpu_kms->mp;
+ int i;

ddev = dpu_kms->dev;
+
+ /* Min vote of BW is required before turning on AXI clk */
+ for (i = 0; i < dpu_kms->num_paths; i++)
+ icc_set_bw(dpu_kms->path[i], 0,
+ dpu_kms->catalog->perf.min_dram_ib);

So I understand and I am fine with this part.

2) Add a min vote in msm_mdss_init().

This is the part I am not able to fully follow.

If we only need to add the min vote before voting for the clocks, adding
it in mdss_mdss_enable() should be enough.

Do we need this part of adding the min vote to the msm_mdss_init()?

If so, why?

> ret = clk_bulk_prepare_enable(msm_mdss->num_clocks, msm_mdss->clocks);
> if (ret) {
> dev_err(msm_mdss->dev, "clock enable failed, ret:%d\n", ret);
> @@ -178,6 +227,7 @@ static int msm_mdss_enable(struct msm_mdss *msm_mdss)
> static int msm_mdss_disable(struct msm_mdss *msm_mdss)
> {
> clk_bulk_disable_unprepare(msm_mdss->num_clocks, msm_mdss->clocks);
> + msm_mdss_icc_request_bw(msm_mdss, 0);
>
> return 0;
> }
> @@ -271,6 +321,14 @@ static struct msm_mdss *msm_mdss_init(struct platform_device *pdev, bool is_mdp5
>
> dev_dbg(&pdev->dev, "mapped mdss address space @%pK\n", msm_mdss->mmio);
>
> + ret = msm_mdss_parse_data_bus_icc_path(&pdev->dev, msm_mdss);
> + if (ret)
> + return ERR_PTR(ret);
> + ret = devm_add_action_or_reset(&pdev->dev, msm_mdss_put_icc_path, msm_mdss);
> + if (ret)
> + return ERR_PTR(ret);
> + msm_mdss_icc_request_bw(msm_mdss, MIN_IB_BW);
> +
> if (is_mdp5)
> ret = mdp5_mdss_parse_clock(pdev, &msm_mdss->clocks);
> else


2022-06-01 18:51:06

by Doug Anderson

[permalink] [raw]
Subject: Re: [PATCH] drm/msm/dpu: Move min BW request and full BW disable back to mdss

Hi,

On Tue, May 31, 2022 at 2:29 PM Abhinav Kumar <[email protected]> wrote:
>
> > @@ -136,6 +178,13 @@ static int msm_mdss_enable(struct msm_mdss *msm_mdss)
> > {
> > int ret;
> >
> > + /*
> > + * Several components have AXI clocks that can only be turned on if
> > + * the interconnect is enabled (non-zero bandwidth). Let's make sure
> > + * that the interconnects are at least at a minimum amount.
> > + */
> > + msm_mdss_icc_request_bw(msm_mdss, MIN_IB_BW);
> > +
>
> This patch does two things :
>
> 1) Move the min icc vote from the dpu_runtime_resume() path to the
> msm_mdss_enable() so that while resuming, we add the min vote to the
> parent device. We do need a min vote before turning on the AXI clk as
> per this comment mentioned in c33b7c0389e1 (drm/msm/dpu: add support for
> clk and bw scaling for display)
>
>
> @@ -1101,8 +1129,15 @@ static int __maybe_unused
> dpu_runtime_resume(struct device *dev)
> struct drm_encoder *encoder;
> struct drm_device *ddev;
> struct dss_module_power *mp = &dpu_kms->mp;
> + int i;
>
> ddev = dpu_kms->dev;
> +
> + /* Min vote of BW is required before turning on AXI clk */
> + for (i = 0; i < dpu_kms->num_paths; i++)
> + icc_set_bw(dpu_kms->path[i], 0,
> + dpu_kms->catalog->perf.min_dram_ib);
>
> So I understand and I am fine with this part.
>
> 2) Add a min vote in msm_mdss_init().
>
> This is the part I am not able to fully follow.
>
> If we only need to add the min vote before voting for the clocks, adding
> it in mdss_mdss_enable() should be enough.
>
> Do we need this part of adding the min vote to the msm_mdss_init()?
>
> If so, why?

Ah, good question. Mostly I was matching how things were done before
commit a670ff578f1f ("drm/msm/dpu: always use mdp device to scale
bandwidth"). Before that commit we used to put the min vote in the
init path.

...but you're right, I don't see any good reason for it. I'll send a
v2 removing that line from msm_mdss_init().

-Doug