2023-07-30 20:44:09

by Sicelo A. Mhlongo

Subject: [PATCH] bus: omap_l3_smx: identify timeout source before rebooting

Identify and print the error source before rebooting the board due to an l3
application timeout error, by delaying the BUG_ON. This is helpful when
debugging, e.g. via serial.

Signed-off-by: Sicelo A. Mhlongo <[email protected]>
---
drivers/bus/omap_l3_smx.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/bus/omap_l3_smx.c b/drivers/bus/omap_l3_smx.c
index bb1606f5ce2d..70f4903d5468 100644
--- a/drivers/bus/omap_l3_smx.c
+++ b/drivers/bus/omap_l3_smx.c
@@ -170,11 +170,9 @@ static irqreturn_t omap3_l3_app_irq(int irq, void *_l3)
status = omap3_l3_readll(l3->rt, L3_SI_FLAG_STATUS_0);
/*
* if we have a timeout error, there's nothing we can
- * do besides rebooting the board. So let's BUG on any
- * of such errors and handle the others. timeout error
- * is severe and not expected to occur.
+ * do besides rebooting the board after identifying the
+ * error source.
*/
- BUG_ON(status & L3_STATUS_0_TIMEOUT_MASK);
} else {
status = omap3_l3_readll(l3->rt, L3_SI_FLAG_STATUS_1);
/* No timeout error for debug sources */
@@ -190,6 +188,12 @@ static irqreturn_t omap3_l3_app_irq(int irq, void *_l3)
ret |= omap3_l3_block_irq(l3, error, error_addr);
}

+ /*
+ * BUG on application timeout errors since they are severe and not
+ * expected to occur.
+ */
+ BUG_ON(status & L3_STATUS_0_TIMEOUT_MASK);
+
/* Clear the status register */
clear = (L3_AGENT_STATUS_CLEAR_IA << int_type) |
L3_AGENT_STATUS_CLEAR_TA;
--
2.40.1



2023-07-31 05:51:32

by Tony Lindgren

Subject: Re: [PATCH] bus: omap_l3_smx: identify timeout source before rebooting

* Sicelo A. Mhlongo <[email protected]> [230730 20:23]:
> Identify and print the error source before rebooting the board due to an l3
> application timeout error, by delaying the BUG_ON. This is helpful when
> debugging, e.g. via serial.

Makes sense to try to show some information, but please see the question
below.

> diff --git a/drivers/bus/omap_l3_smx.c b/drivers/bus/omap_l3_smx.c
> index bb1606f5ce2d..70f4903d5468 100644
> --- a/drivers/bus/omap_l3_smx.c
> +++ b/drivers/bus/omap_l3_smx.c
> @@ -170,11 +170,9 @@ static irqreturn_t omap3_l3_app_irq(int irq, void *_l3)
> status = omap3_l3_readll(l3->rt, L3_SI_FLAG_STATUS_0);
> /*
> * if we have a timeout error, there's nothing we can
> - * do besides rebooting the board. So let's BUG on any
> - * of such errors and handle the others. timeout error
> - * is severe and not expected to occur.
> + * do besides rebooting the board after identifying the
> + * error source.
> */
> - BUG_ON(status & L3_STATUS_0_TIMEOUT_MASK);
> } else {
> status = omap3_l3_readll(l3->rt, L3_SI_FLAG_STATUS_1);
> /* No timeout error for debug sources */
> @@ -190,6 +188,12 @@ static irqreturn_t omap3_l3_app_irq(int irq, void *_l3)
> ret |= omap3_l3_block_irq(l3, error, error_addr);
> }
>
> + /*
> + * BUG on application timeout errors since they are severe and not
> + * expected to occur.
> + */
> + BUG_ON(status & L3_STATUS_0_TIMEOUT_MASK);

Aren't you now checking the bit for both L3_SI_FLAG_STATUS_0 and
L3_SI_FLAG_STATUS_1 register values? I think it should be only for register
L3_SI_FLAG_STATUS_0 value?

Regards,

Tony

2023-07-31 09:47:51

by Sicelo A. Mhlongo

Subject: Re: [PATCH] bus: omap_l3_smx: identify timeout source before rebooting

Hi,

On Mon, Jul 31, 2023 at 08:29:04AM +0300, Tony Lindgren wrote:
> * Sicelo A. Mhlongo <[email protected]> [230730 20:23]:
> > Identify and print the error source before rebooting the board due to an l3
> > application timeout error, by delaying the BUG_ON. This is helpful when
> > debugging, e.g. via serial.
>
> Makes sense to try to show some information, but please see the question
> below.
>
> > diff --git a/drivers/bus/omap_l3_smx.c b/drivers/bus/omap_l3_smx.c
> > index bb1606f5ce2d..70f4903d5468 100644
> > --- a/drivers/bus/omap_l3_smx.c
> > +++ b/drivers/bus/omap_l3_smx.c
> > @@ -170,11 +170,9 @@ static irqreturn_t omap3_l3_app_irq(int irq, void *_l3)
> > status = omap3_l3_readll(l3->rt, L3_SI_FLAG_STATUS_0);
> > /*
> > * if we have a timeout error, there's nothing we can
> > - * do besides rebooting the board. So let's BUG on any
> > - * of such errors and handle the others. timeout error
> > - * is severe and not expected to occur.
> > + * do besides rebooting the board after identifying the
> > + * error source.
> > */
> > - BUG_ON(status & L3_STATUS_0_TIMEOUT_MASK);
> > } else {
> > status = omap3_l3_readll(l3->rt, L3_SI_FLAG_STATUS_1);
> > /* No timeout error for debug sources */
> > @@ -190,6 +188,12 @@ static irqreturn_t omap3_l3_app_irq(int irq, void *_l3)
> > ret |= omap3_l3_block_irq(l3, error, error_addr);
> > }
> >
> > + /*
> > + * BUG on application timeout errors since they are severe and not
> > + * expected to occur.
> > + */
> > + BUG_ON(status & L3_STATUS_0_TIMEOUT_MASK);
>
> Aren't you now checking the bit for both L3_SI_FLAG_STATUS_0 and
> L3_SI_FLAG_STATUS_1 register values? I think it should be only for register
> L3_SI_FLAG_STATUS_0 value?
>

Ah, you are right. It should be:

`BUG_ON(!int_type && status & L3_STATUS_0_TIMEOUT_MASK);`

I'll send in a v2.

Thanks
Sicelo
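
For readers following the thread, the difference between the check as posted and the check proposed for v2 can be sketched in user space as below. This is only an illustration, not the driver code: DEMO_TIMEOUT_MASK is a made-up stand-in for L3_STATUS_0_TIMEOUT_MASK, the helper names are hypothetical, and the sketch assumes that an int_type of 0 denotes the application error interrupt (the one that reads L3_SI_FLAG_STATUS_0), which is what the `!int_type` test in the proposal implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 0 = application error, 1 = debug error. */
#define DEMO_APPLICATION_ERROR 0
#define DEMO_DEBUG_ERROR       1

/* Hypothetical stand-in for L3_STATUS_0_TIMEOUT_MASK; not the real value. */
#define DEMO_TIMEOUT_MASK      0x38ULL

/*
 * Check as posted in the patch: the mask is applied to whichever status
 * register was read, so a STATUS_1 (debug) value can also trigger it.
 */
bool v1_would_bug(int int_type, uint64_t status)
{
	(void)int_type;
	return (status & DEMO_TIMEOUT_MASK) != 0;
}

/*
 * Check proposed for v2: only a STATUS_0 (application) value is tested
 * against the timeout mask.
 */
bool v2_would_bug(int int_type, uint64_t status)
{
	return !int_type && (status & DEMO_TIMEOUT_MASK);
}

/* Small self-check covering the case raised in the review. */
void check_examples(void)
{
	uint64_t status = 0x8; /* one bit inside the demo mask */

	/* Application interrupt: both variants treat this as a timeout. */
	assert(v1_would_bug(DEMO_APPLICATION_ERROR, status));
	assert(v2_would_bug(DEMO_APPLICATION_ERROR, status));

	/* Debug interrupt: only the posted variant fires. */
	assert(v1_would_bug(DEMO_DEBUG_ERROR, status));
	assert(!v2_would_bug(DEMO_DEBUG_ERROR, status));
}
```

With a debug interrupt whose STATUS_1 value happens to have a bit set at the same position as a timeout bit, the posted check would BUG while the proposed one would not, which is the concern raised in the review.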