2023-12-07 06:13:37

by Mukesh Ojha

Subject: [PATCH] irqchip/gic-v3-its: BUG_ON if stall bit is set

Various software errors can cause the stall bit to
be set while commands in the command queue are
being processed, and waiting for 1s is not going to
help in debugging, as command processing is going
to time out anyway and the system will continue to
run and may crash some time later because of this.

So, to debug which command caused the stall bit to
be set in such cases, BUG_ON right away.

Signed-off-by: Mukesh Ojha <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 9a7a74239eab..8983e0a3318c 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1078,6 +1078,11 @@ static int its_wait_for_range_completion(struct its_node *its,
 		s64 delta;
 
 		rd_idx = readl_relaxed(its->base + GITS_CREADR);
+		/*
+		 * Check for stall bit as there is no point in waiting
+		 * for 1s if the stall bit is already set.
+		 */
+		BUG_ON(rd_idx & 1);
 
 		/*
 		 * Compute the read pointer progress, taking the
--
2.7.4


2023-12-07 07:43:18

by Marc Zyngier

Subject: Re: [PATCH] irqchip/gic-v3-its: BUG_ON if stall bit is set

On Thu, 07 Dec 2023 06:12:39 +0000,
Mukesh Ojha <[email protected]> wrote:
>
> Various software errors can cause the stall bit to
> be set while commands in the command queue are
> being processed, and

Such as?

> waiting for 1s is not going to help in debugging,
> as command processing is going to time out anyway
> and the system will continue to run and may crash
> some time later because of this.
>
> So, to debug which command caused the stall bit to
> be set in such cases, BUG_ON right away.

How on Earth will killing the system allow *anything* to be further
debugged?

If you need debug information, add the correct debug statements using
pr_debug(). Even better, try to gracefully recover from it if the ITS
command queue supports restarting.

Crashing the system is not an option.
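
For illustration only, here is a minimal userspace sketch of the non-fatal
alternative described above: report the stall and return an error instead of
calling BUG_ON. It is not the driver's code. The name its_poll_sketch, the
reads array (standing in for successive readl_relaxed() reads of
GITS_CREADR), and the choice of -EIO are assumptions made for this example,
and fprintf() stands in for pr_debug(). The stall bit position follows the
patch's `rd_idx & 1` check; the offset mask assumes the read offset sits in
bits [19:5] of the register.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed GITS_CREADR layout: bit 0 = Stalled, bits [19:5] = read offset. */
#define CREADR_STALLED		(1u << 0)
#define CREADR_OFFSET_MASK	0x000fffe0u

/*
 * Hypothetical polling loop. reads[] models the values returned by
 * successive reads of GITS_CREADR; n is the retry budget.
 *
 * Returns 0 once the read offset reaches to_idx, -EIO as soon as the
 * stall bit is seen, and -ETIMEDOUT if the budget runs out first.
 * The wrap-around handling of the real driver is left out for brevity.
 */
static int its_poll_sketch(const uint32_t *reads, size_t n, uint32_t to_idx)
{
	for (size_t i = 0; i < n; i++) {
		uint32_t creadr = reads[i];
		uint32_t rd_idx = creadr & CREADR_OFFSET_MASK;

		if (creadr & CREADR_STALLED) {
			/* Report where the queue stopped, then bail out early. */
			fprintf(stderr, "ITS queue stalled at offset 0x%x\n",
				(unsigned int)rd_idx);
			return -EIO;
		}

		if (rd_idx >= to_idx)
			return 0;
	}

	return -ETIMEDOUT;
}
```

The caller still learns immediately that the queue stalled and at which
offset, which is the information the patch was after, while the rest of the
system keeps running and can attempt recovery or fail the one operation.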

M.

--
Without deviation from the norm, progress is not possible.