2022-01-21 21:10:48

by Nikita Yushchenko

Subject: [PATCH v2] media: vsp1: mask interrupts before enabling

The VSP hardware could have been used (e.g. by the bootloader) before the
driver loads, leaving some interrupts enabled and pending. In this case,
installing the VSP interrupt handler without masking interrupts first
causes the handler to be called immediately (and to crash on a NULL
vsp1->info dereference).

Fix that by explicitly masking all interrupts before installing the
interrupt handler. To do so, the handler has to be installed later, once
the hardware revision has been detected and the number of interrupts to
mask is known.

Based on patch by Koji Matsuoka <[email protected]> included
in the Renesas BSP kernel.

Signed-off-by: Nikita Yushchenko <[email protected]>
---
v1: https://lore.kernel.org/all/[email protected]/
Changes since v1:
- move interrupt masking to a dedicated routine
- update comments and patch description

> I think I would rather see the code to reset them done in
> vsp1_reset_wpf(), rather than in probe directly as that is what we are
> doing, and is I believe already in the call path.

First, vsp1_reset_wpf() does not get called during early driver init.

It is normally called from within vsp1_device_get() when the device is
powered on, but vsp1_probe() calls vsp1_device_get() when vsp1->info is not
yet set, and in that case the call from vsp1_pm_runtime_resume() to
vsp1_device_init() is skipped.

I've tried adding extra vsp1_device_put() / vsp1_device_get() calls to the
probe path, dumped the related registers in vsp1_pm_runtime_resume() after
the return from vsp1_device_init(), and got:

[ 2.477315][ T1] vsp1 fea28000.vsp: VI6_DISP_IRQ_ENB(0) = 0x00000100
[ 2.483933][ T1] vsp1 fea28000.vsp: VI6_DISP_IRQ_STA(0) = 0x00000121
[ 2.490556][ T1] vsp1 fea28000.vsp: VI6_WPF_IRQ_ENB(0) = 0x00010002
[ 2.497088][ T1] vsp1 fea28000.vsp: VI6_WPF_IRQ_STA(0) = 0x00010003
[ 2.503618][ T1] vsp1 fea28000.vsp: VI6_WPF_IRQ_ENB(1) = 0x00000000
[ 2.510148][ T1] vsp1 fea28000.vsp: VI6_WPF_IRQ_STA(1) = 0x00000000

which shows that
(1) the WPF interrupt is not cleared by the WPF reset,
(2) the DISP interrupt is also enabled and pending, and the driver does not
seem to control it at all.

Given that, I think it is safer to explicitly mask all interrupts before
setting the handler. I've moved interrupt masking to a separate routine.

> (But I'm reallly ... reallly concerned that the hardware is not really
> getting reset when it should, and that might merit some further
> investigation).

The documentation for the WPF reset bit notes that in some situations the
reset is postponed for a long time, and reported via interrupt. I'm not
sure what exactly goes on there, but I'd assume that such logic implies
that the interrupt subsystem is not reset.

I agree that not having an exact understanding of the hardware state is not
good. But, given that no signs of malfunction have been detected for a long
time (the patch has been in the vendor BSP for years), I think we can assume
it is "safe enough".

drivers/media/platform/vsp1/vsp1_drv.c | 34 ++++++++++++++++++++------
1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/drivers/media/platform/vsp1/vsp1_drv.c b/drivers/media/platform/vsp1/vsp1_drv.c
index c9044785b903..92a95e2c21c7 100644
--- a/drivers/media/platform/vsp1/vsp1_drv.c
+++ b/drivers/media/platform/vsp1/vsp1_drv.c
@@ -550,6 +550,16 @@ static int vsp1_device_init(struct vsp1_device *vsp1)
 	return 0;
 }

+static void vsp1_mask_all_interrupts(struct vsp1_device *vsp1)
+{
+	int i;
+
+	for (i = 0; i < vsp1->info->lif_count; ++i)
+		vsp1_write(vsp1, VI6_DISP_IRQ_ENB(i), 0);
+	for (i = 0; i < vsp1->info->wpf_count; ++i)
+		vsp1_write(vsp1, VI6_WPF_IRQ_ENB(i), 0);
+}
+
 /*
  * vsp1_device_get - Acquire the VSP1 device
  *
@@ -819,13 +829,6 @@ static int vsp1_probe(struct platform_device *pdev)
 		return -EINVAL;
 	}

-	ret = devm_request_irq(&pdev->dev, irq->start, vsp1_irq_handler,
-			       IRQF_SHARED, dev_name(&pdev->dev), vsp1);
-	if (ret < 0) {
-		dev_err(&pdev->dev, "failed to request IRQ\n");
-		return ret;
-	}
-
 	/* FCP (optional). */
 	fcp_node = of_parse_phandle(pdev->dev.of_node, "renesas,fcp", 0);
 	if (fcp_node) {
@@ -855,7 +858,6 @@ static int vsp1_probe(struct platform_device *pdev)
 		goto done;

 	vsp1->version = vsp1_read(vsp1, VI6_IP_VERSION);
-	vsp1_device_put(vsp1);

 	for (i = 0; i < ARRAY_SIZE(vsp1_device_infos); ++i) {
 		if ((vsp1->version & VI6_IP_VERSION_MODEL_MASK) ==
@@ -868,12 +870,28 @@ static int vsp1_probe(struct platform_device *pdev)
 	if (!vsp1->info) {
 		dev_err(&pdev->dev, "unsupported IP version 0x%08x\n",
 			vsp1->version);
+		vsp1_device_put(vsp1);
 		ret = -ENXIO;
 		goto done;
 	}

 	dev_dbg(&pdev->dev, "IP version 0x%08x\n", vsp1->version);

+	/*
+	 * Previous use of the hardware (e.g. by the bootloader) could leave
+	 * some interrupts enabled and pending.
+	 */
+	vsp1_mask_all_interrupts(vsp1);
+
+	vsp1_device_put(vsp1);
+
+	ret = devm_request_irq(&pdev->dev, irq->start, vsp1_irq_handler,
+			       IRQF_SHARED, dev_name(&pdev->dev), vsp1);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "failed to request IRQ\n");
+		goto done;
+	}
+
 	/* Instantiate entities. */
 	ret = vsp1_create_entities(vsp1);
 	if (ret < 0) {
--
2.30.2


2022-01-24 14:02:30

by Laurent Pinchart

Subject: Re: [PATCH v2] media: vsp1: mask interrupts before enabling

Hi Nikita,

Thank you for the patch.

On Thu, Jan 20, 2022 at 11:15:30AM +0300, Nikita Yushchenko wrote:
> VSP hardware could be used (e.g. by the bootloader) before driver load,
> and some interrupts could be left in enabled and pending state. In this
> case, setting up VSP interrupt handler without masking interrupts before
> causes interrupt handler to be immediately called (and crash due to null
> vsp->info dereference).
>
> Fix that by explicitly masking all interrupts before setting the interrupt
> handler. To do so, have to set the interrupt handler later, after hw
> revision is already detected and number of interrupts to mask gets
> known.
>
> Based on patch by Koji Matsuoka <[email protected]> included
> in the Renesas BSP kernel.
>
> Signed-off-by: Nikita Yushchenko <[email protected]>
> ---
> v1: https://lore.kernel.org/all/[email protected]/
> Changes since v1:
> - move interrupt masking to a dedicated routine
> - update comments and patch description
>
> > I think I would rather see the code to reset them done in
> > vsp1_reset_wpf(), rather than in probe directly as that is what we are
> > doing, and is I believe already in the call path.
>
> First, vsp1_reset_wpf() does not get called on driver early init.
>
> It is normally called from within vsp1_device_get() when device is powered
> on, but vsp1_probe() calls vsp1_device_get() when vsp1->info is not yet set,
> and in this case call from vsp1_pm_runtime_resume() to vsp1_device_init()
> is skipped.
>
> I've tried to add extra vsp1_device_put() / vsp1_device_get() calls to the
> probe path, and dumped related registers in vsp1_pm_runtime_resume() after
> return from vsp1_device_init(), and got
>
> [ 2.477315][ T1] vsp1 fea28000.vsp: VI6_DISP_IRQ_ENB(0) = 0x00000100
> [ 2.483933][ T1] vsp1 fea28000.vsp: VI6_DISP_IRQ_STA(0) = 0x00000121
> [ 2.490556][ T1] vsp1 fea28000.vsp: VI6_WPF_IRQ_ENB(0) = 0x00010002
> [ 2.497088][ T1] vsp1 fea28000.vsp: VI6_WPF_IRQ_STA(0) = 0x00010003
> [ 2.503618][ T1] vsp1 fea28000.vsp: VI6_WPF_IRQ_ENB(1) = 0x00000000
> [ 2.510148][ T1] vsp1 fea28000.vsp: VI6_WPF_IRQ_STA(1) = 0x00000000
>
> which shows that
> (1) WPF interrupt is not cleared by WPF reset,
> (2) also DISP interrupt is enabled and pending, and driver does not seem
> to control it at all.
>
> Given that, I think it is safer to explicitly mask all interrupts before
> setting the handler. I've moved interrupt masking to a separate routine.
>
> > (But I'm reallly ... reallly concerned that the hardware is not really
> > getting reset when it should, and that might merit some further
> > investigation).
>
> The documentation for WFP reset bit has notes that under some situations,
> reset is postponed for a long time, and reported via interrupt. I'm not
> sure what exactly goes on there, but I'd assume that such logic implies
> that interrupt subsystem is not reset.
>
> I agree that not having exact understand of hardware state is not good.
> But, given that no signs of misfunction have been detected for a long time
> (the patch was in vendor BSP for years), I think we can assume it is
> "safe enough".

We could also use the reset provided by the SRCR registers in the CPG.
In the meantime, this patch seems to go in the right direction: it fixes
the problem without being too intrusive.

> drivers/media/platform/vsp1/vsp1_drv.c | 34 ++++++++++++++++++++------
> 1 file changed, 26 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/media/platform/vsp1/vsp1_drv.c b/drivers/media/platform/vsp1/vsp1_drv.c
> index c9044785b903..92a95e2c21c7 100644
> --- a/drivers/media/platform/vsp1/vsp1_drv.c
> +++ b/drivers/media/platform/vsp1/vsp1_drv.c
> @@ -550,6 +550,16 @@ static int vsp1_device_init(struct vsp1_device *vsp1)
> return 0;
> }
>
> +static void vsp1_mask_all_interrupts(struct vsp1_device *vsp1)
> +{
> + int i;

This should be unsigned int as it never takes negative values.

> +
> + for (i = 0; i < vsp1->info->lif_count; ++i)
> + vsp1_write(vsp1, VI6_DISP_IRQ_ENB(i), 0);
> + for (i = 0; i < vsp1->info->wpf_count; ++i)
> + vsp1_write(vsp1, VI6_WPF_IRQ_ENB(i), 0);
> +}
> +
> /*
> * vsp1_device_get - Acquire the VSP1 device
> *
> @@ -819,13 +829,6 @@ static int vsp1_probe(struct platform_device *pdev)
> return -EINVAL;
> }
>
> - ret = devm_request_irq(&pdev->dev, irq->start, vsp1_irq_handler,
> - IRQF_SHARED, dev_name(&pdev->dev), vsp1);
> - if (ret < 0) {
> - dev_err(&pdev->dev, "failed to request IRQ\n");
> - return ret;
> - }
> -
> /* FCP (optional). */
> fcp_node = of_parse_phandle(pdev->dev.of_node, "renesas,fcp", 0);
> if (fcp_node) {
> @@ -855,7 +858,6 @@ static int vsp1_probe(struct platform_device *pdev)
> goto done;
>
> vsp1->version = vsp1_read(vsp1, VI6_IP_VERSION);
> - vsp1_device_put(vsp1);
>
> for (i = 0; i < ARRAY_SIZE(vsp1_device_infos); ++i) {
> if ((vsp1->version & VI6_IP_VERSION_MODEL_MASK) ==
> @@ -868,12 +870,28 @@ static int vsp1_probe(struct platform_device *pdev)
> if (!vsp1->info) {
> dev_err(&pdev->dev, "unsupported IP version 0x%08x\n",
> vsp1->version);
> + vsp1_device_put(vsp1);
> ret = -ENXIO;
> goto done;
> }
>
> dev_dbg(&pdev->dev, "IP version 0x%08x\n", vsp1->version);
>
> + /*
> + * Previous use of the hardware (e.g. by the bootloader) could leave
> + * some interrupts enabled and pending.

I would add

* TODO: Investigate if this shouldn't be better handled by using the
* device reset provided by the CPG.

to avoid forgetting about it.

If you're fine with those two changes, there's no need for a v3, I can
handle this when applying the patch.

Reviewed-by: Laurent Pinchart <[email protected]>

> + */
> + vsp1_mask_all_interrupts(vsp1);
> +
> + vsp1_device_put(vsp1);
> +
> + ret = devm_request_irq(&pdev->dev, irq->start, vsp1_irq_handler,
> + IRQF_SHARED, dev_name(&pdev->dev), vsp1);
> + if (ret < 0) {
> + dev_err(&pdev->dev, "failed to request IRQ\n");
> + goto done;
> + }
> +
> /* Instantiate entities. */
> ret = vsp1_create_entities(vsp1);
> if (ret < 0) {

--
Regards,

Laurent Pinchart

2022-01-24 17:07:55

by Nikita Yushchenko

Subject: Re: [PATCH v2] media: vsp1: mask interrupts before enabling

> If you're fine with those two changes, there's no need for a v3, I can
> handle this when applying the patch.

Ok, please do that.

Nikita

2022-01-24 19:01:31

by Kieran Bingham

Subject: Re: [PATCH v2] media: vsp1: mask interrupts before enabling

Hi Nikita,

Thank you for the v2,

Quoting Nikita Yushchenko (2022-01-20 08:15:30)
> VSP hardware could be used (e.g. by the bootloader) before driver load,
> and some interrupts could be left in enabled and pending state. In this
> case, setting up VSP interrupt handler without masking interrupts before
> causes interrupt handler to be immediately called (and crash due to null
> vsp->info dereference).
>
> Fix that by explicitly masking all interrupts before setting the interrupt
> handler. To do so, have to set the interrupt handler later, after hw
> revision is already detected and number of interrupts to mask gets
> known.
>
> Based on patch by Koji Matsuoka <[email protected]> included
> in the Renesas BSP kernel.
>
> Signed-off-by: Nikita Yushchenko <[email protected]>
> ---
> v1: https://lore.kernel.org/all/[email protected]/
> Changes since v1:
> - move interrupt masking to a dedicated routine
> - update comments and patch description
>
> > I think I would rather see the code to reset them done in
> > vsp1_reset_wpf(), rather than in probe directly as that is what we are
> > doing, and is I believe already in the call path.
>
> First, vsp1_reset_wpf() does not get called on driver early init.
>
> It is normally called from within vsp1_device_get() when device is powered
> on, but vsp1_probe() calls vsp1_device_get() when vsp1->info is not yet set,
> and in this case call from vsp1_pm_runtime_resume() to vsp1_device_init()
> is skipped.

Aha, I seem to have a long mail to you stuck in my drafts that never made
it out as a reply to your last mail. I'm sorry about that, but it looks
like we're progressing anyway.

In that draft, I agreed that yes, we have a catch-22 around setting
->info, as we need to power up to determine the specific model.

> I've tried to add extra vsp1_device_put() / vsp1_device_get() calls to the
> probe path, and dumped related registers in vsp1_pm_runtime_resume() after
> return from vsp1_device_init(), and got
>
> [ 2.477315][ T1] vsp1 fea28000.vsp: VI6_DISP_IRQ_ENB(0) = 0x00000100
> [ 2.483933][ T1] vsp1 fea28000.vsp: VI6_DISP_IRQ_STA(0) = 0x00000121
> [ 2.490556][ T1] vsp1 fea28000.vsp: VI6_WPF_IRQ_ENB(0) = 0x00010002
> [ 2.497088][ T1] vsp1 fea28000.vsp: VI6_WPF_IRQ_STA(0) = 0x00010003
> [ 2.503618][ T1] vsp1 fea28000.vsp: VI6_WPF_IRQ_ENB(1) = 0x00000000
> [ 2.510148][ T1] vsp1 fea28000.vsp: VI6_WPF_IRQ_STA(1) = 0x00000000
>
> which shows that
> (1) WPF interrupt is not cleared by WPF reset,
> (2) also DISP interrupt is enabled and pending, and driver does not seem
> to control it at all.
>
> Given that, I think it is safer to explicitly mask all interrupts before
> setting the handler. I've moved interrupt masking to a separate routine.

Thank you. Being a separate function at least makes it a bit cleaner, and
makes it clear that it's not specifically a part of probe() but something
we do to clean up.


> > (But I'm reallly ... reallly concerned that the hardware is not really
> > getting reset when it should, and that might merit some further
> > investigation).
>
> The documentation for WFP reset bit has notes that under some situations,
> reset is postponed for a long time, and reported via interrupt. I'm not
> sure what exactly goes on there, but I'd assume that such logic implies
> that interrupt subsystem is not reset.
>
> I agree that not having exact understand of hardware state is not good.
> But, given that no signs of misfunction have been detected for a long time
> (the patch was in vendor BSP for years), I think we can assume it is
> "safe enough".

Yes, I think this adds safety by re-asserting our assumptions at probe
time, which sounds like a good thing given the circumstances.


Reviewed-by: Kieran Bingham <[email protected]>

> drivers/media/platform/vsp1/vsp1_drv.c | 34 ++++++++++++++++++++------
> 1 file changed, 26 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/media/platform/vsp1/vsp1_drv.c b/drivers/media/platform/vsp1/vsp1_drv.c
> index c9044785b903..92a95e2c21c7 100644
> --- a/drivers/media/platform/vsp1/vsp1_drv.c
> +++ b/drivers/media/platform/vsp1/vsp1_drv.c
> @@ -550,6 +550,16 @@ static int vsp1_device_init(struct vsp1_device *vsp1)
> return 0;
> }
>
> +static void vsp1_mask_all_interrupts(struct vsp1_device *vsp1)
> +{
> + int i;
> +
> + for (i = 0; i < vsp1->info->lif_count; ++i)
> + vsp1_write(vsp1, VI6_DISP_IRQ_ENB(i), 0);
> + for (i = 0; i < vsp1->info->wpf_count; ++i)
> + vsp1_write(vsp1, VI6_WPF_IRQ_ENB(i), 0);
> +}
> +
> /*
> * vsp1_device_get - Acquire the VSP1 device
> *
> @@ -819,13 +829,6 @@ static int vsp1_probe(struct platform_device *pdev)
> return -EINVAL;
> }
>
> - ret = devm_request_irq(&pdev->dev, irq->start, vsp1_irq_handler,
> - IRQF_SHARED, dev_name(&pdev->dev), vsp1);
> - if (ret < 0) {
> - dev_err(&pdev->dev, "failed to request IRQ\n");
> - return ret;
> - }
> -
> /* FCP (optional). */
> fcp_node = of_parse_phandle(pdev->dev.of_node, "renesas,fcp", 0);
> if (fcp_node) {
> @@ -855,7 +858,6 @@ static int vsp1_probe(struct platform_device *pdev)
> goto done;
>
> vsp1->version = vsp1_read(vsp1, VI6_IP_VERSION);
> - vsp1_device_put(vsp1);
>
> for (i = 0; i < ARRAY_SIZE(vsp1_device_infos); ++i) {
> if ((vsp1->version & VI6_IP_VERSION_MODEL_MASK) ==
> @@ -868,12 +870,28 @@ static int vsp1_probe(struct platform_device *pdev)
> if (!vsp1->info) {
> dev_err(&pdev->dev, "unsupported IP version 0x%08x\n",
> vsp1->version);
> + vsp1_device_put(vsp1);
> ret = -ENXIO;
> goto done;
> }
>
> dev_dbg(&pdev->dev, "IP version 0x%08x\n", vsp1->version);
>
> + /*
> + * Previous use of the hardware (e.g. by the bootloader) could leave
> + * some interrupts enabled and pending.
> + */
> + vsp1_mask_all_interrupts(vsp1);
> +
> + vsp1_device_put(vsp1);
> +
> + ret = devm_request_irq(&pdev->dev, irq->start, vsp1_irq_handler,
> + IRQF_SHARED, dev_name(&pdev->dev), vsp1);
> + if (ret < 0) {
> + dev_err(&pdev->dev, "failed to request IRQ\n");
> + goto done;
> + }
> +
> /* Instantiate entities. */
> ret = vsp1_create_entities(vsp1);
> if (ret < 0) {
> --
> 2.30.2
>