On Tue, Jan 16, 2024 at 04:32:31AM -0300, Leonardo Bras wrote:
> While dealing with a bug that breaks the serial console from qemu (8250)
> after printing a lot of data, I found some issues in this driver on RT
> as well as spurious IRQ behaviors that don't seem to be adequate for RT.
>
> Comments:
> Patch #1:
> I found out this driver gets an IRQ request for every tx byte, but the
> handler is able to deal with sending multiple bytes per "thread wake up".
>
> Since irqs_unhandled keeps growing, and threads_handled doesn't change
> as often, after some intense load (tx ~300kBytes) the serial will
> disable the IRQ line for this driver, which ultimately breaks the console.
>
> My first solution kept track of how many requests a given handler dealt
> with, and added that count to threads_handled. In note_interrupt() I took
> the diff from threads_handled_last and subtracted it from irqs_unhandled.
>
> This solution required a change in the irqreturn_t typedef and a bunch of
> helpers and defines, as well as adapting the 8250 driver.
> In the end it seemed like an overcomplicated solution for the issue, but
> it can be an alternative if the current solution is considered imprecise.
>
> My current solution in patch #1 is much simpler: it keeps the IRQ
> enabled as long as the irq_thread deals with any IRQ request before
> irqs_unhandled hits the limit value.
>
> Patch #2:
> In RT, the 8250 driver has an issue if it's interrupted while holding the
> port->lock. If the interruption needs to printk() anything, it
> will try to get the port->lock, which is busy, so spin_lock() will try
> to reschedule the interruption, which is in atomic context, and will
> trigger a bug.
>
> This bug reproduces quite often, in about 50% of the tests I ran.
>
> The only thing I could think of for fixing this is using in_atomic()
> when PREEMPT_RT=y, so it makes use of the same mechanism as for
> oops_in_progress to avoid getting the lock if it's busy. It's working
> just fine.
>
> Yeah, I got the warning in checkpatch:
> "ERROR: do not use in_atomic in drivers"
>
> So I need some feedback on what to do to avoid this bug, if not
> by using in_atomic() in this driver.
>
> Since this one is linked to the console, any printk will try to get
> this driver's port->lock, and so it's kind of hard to avoid these accesses.
>
> I thought about adding an interface for spin_lock_only_if_can_sleep() but
> it seemed overkill.
>
> Please provide comments / feedback.
>
> Thanks!
> Leo
>
>
> Leonardo Bras (2):
> irq/spurious: Reset irqs_unhandled if an irq_thread handles one IRQ
> request
> serial/8250: Avoid getting lock in RT atomic context
>
> drivers/tty/serial/8250/8250_port.c | 2 +-
> kernel/irq/spurious.c | 8 ++++++++
> 2 files changed, 9 insertions(+), 1 deletion(-)
>
>
> base-commit: 052d534373b7ed33712a63d5e17b2b6cdbce84fd
> --
> 2.43.0
>
Resent this one with the correct order / patches.