2023-08-11 07:16:23

by Jiri Slaby

[permalink] [raw]
Subject: [PATCH] serial: 8250: drop lockdep annotation from serial8250_clear_IER()

The port lock is not always held when calling serial8250_clear_IER().
When an oops is in progress, the lock is tried to be taken and when it
is not, a warning is issued:
WARNING: CPU: 0 PID: 1 at drivers/tty/serial/8250/8250_port.c:707 +0x57/0x60
Modules linked in:
CPU: 0 PID: 1 Comm: init Not tainted 6.5.0-rc5-1.g225bfb7-default+ #774 00f1be860db663ed29479b8255d3b01ab1135bd3
Hardware name: QEMU Standard PC ...
RIP: 0010:serial8250_clear_IER+0x57/0x60
...
Call Trace:
<TASK>
serial8250_console_write+0x9e/0x4b0
console_flush_all+0x217/0x5f0
...

Therefore, remove the annotation as it doesn't hold for all invocations.

The other option would be to make the lockdep test conditional on
'oops_in_progress' or pass 'locked' from serial8250_console_write(). I
don't think, that is worth it.

Signed-off-by: Jiri Slaby (SUSE) <[email protected]>
Reported-by: Vlastimil Babka <[email protected]>
Cc: John Ogness <[email protected]>
Fixes: d0b309a5d3f4 (serial: 8250: synchronize and annotate UART_IER access)
---
drivers/tty/serial/8250/8250_port.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index ecfdc4534123..f59328e1c35d 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -703,9 +703,6 @@ static void serial8250_set_sleep(struct uart_8250_port *p, int sleep)

static void serial8250_clear_IER(struct uart_8250_port *up)
{
- /* Port locked to synchronize UART_IER access against the console. */
- lockdep_assert_held_once(&up->port.lock);
-
if (up->capabilities & UART_CAP_UUE)
serial_out(up, UART_IER, UART_IER_UUE);
else
--
2.41.0



2023-08-14 06:40:41

by Jiri Slaby

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: drop lockdep annotation from serial8250_clear_IER()

On 14. 08. 23, 8:15, John Ogness wrote:
> On 2023-08-11, "Jiri Slaby (SUSE)" <[email protected]> wrote:
>> The port lock is not always held when calling serial8250_clear_IER().
>> When an oops is in progress, the lock is tried to be taken and when it
>> is not, a warning is issued:
>
> Yes, and that is a potential deadlock. The warning is correct.

Could you elaborate on how can not-taking a lock be a potential deadlock?

>> Therefore, remove the annotation as it doesn't hold for all invocations.
>
> ... because those invocations are broken by design.

Perhaps. But the system is crashing. Better to emit something without
the lock rather than nothing (and wait for the lock infinitely).

>> The other option would be to make the lockdep test conditional on
>> 'oops_in_progress' or pass 'locked' from serial8250_console_write(). I
>> don't think, that is worth it.
>
> The proper thing to do is to fix the invocation. The upcoming atomic
> console implementation for the 8250 does exactly that.

So what does it do?

> If this patch gets accepted (which it appears it will be), I will revert
> it in my series implementing the 8250 atomic console.

That's fine as soon as the warning is not a problem.

thanks,
--
js
suse labs


2023-08-14 06:44:50

by John Ogness

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: drop lockdep annotation from serial8250_clear_IER()

On 2023-08-11, "Jiri Slaby (SUSE)" <[email protected]> wrote:
> The port lock is not always held when calling serial8250_clear_IER().
> When an oops is in progress, the lock is tried to be taken and when it
> is not, a warning is issued:

Yes, and that is a potential deadlock. The warning is correct.

> Therefore, remove the annotation as it doesn't hold for all invocations.

... because those invocations are broken by design.

> The other option would be to make the lockdep test conditional on
> 'oops_in_progress' or pass 'locked' from serial8250_console_write(). I
> don't think, that is worth it.

The proper thing to do is to fix the invocation. The upcoming atomic
console implementation for the 8250 does exactly that.

If this patch gets accepted (which it appears it will be), I will revert
it in my series implementing the 8250 atomic console.

John Ogness

2023-08-14 09:47:11

by John Ogness

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: drop lockdep annotation from serial8250_clear_IER()

Hi Jiri,

Thanks for the follow-up. You responded faster than I could correct
myself.

On 2023-08-14, Jiri Slaby <[email protected]> wrote:
>>> The port lock is not always held when calling serial8250_clear_IER().
>>> When an oops is in progress, the lock is tried to be taken and when it
>>> is not, a warning is issued:
>>
>> Yes, and that is a potential deadlock. The warning is correct.
>
> Could you elaborate on how can not-taking a lock be a potential
> deadlock?

I was wrong to say deadlock. The lockdep annotation is about interrupts
being unintentionally left permanently disabled or being enabled while
another CPU is transmitting.

>>> Therefore, remove the annotation as it doesn't hold for all invocations.
>>
>> ... because those invocations are broken by design.
>
> Perhaps. But the system is crashing. Better to emit something without
> the lock rather than nothing (and wait for the lock infinitely).

I am not suggesting to wait infinitely. I am merely pointing out that
the lockdep warning is legitimate.

>>> The other option would be to make the lockdep test conditional on
>>> 'oops_in_progress'

Actually I find this suggestion more appropriate. It makes it clear that
we are willing to take such risks and do not want to see the warnings in
a panic situation. However, I would end up having to revert that change
as well, so it really does not matter to me at this point. Either way I
will be reverting this patch.

>> The proper thing to do is to fix the invocation. The upcoming atomic
>> console implementation for the 8250 does exactly that.
>
> So what does it do?

The upcoming atomic consoles use a new type of synchronization to guard
the IER register (priority-based spinning with timeouts). This allows us
to make intelligent decisions about how and when to flush in a panic,
rather than simply ignorning locks and hoping for the best.

>> If this patch gets accepted (which it appears it will be), I will revert
>> it in my series implementing the 8250 atomic console.
>
> That's fine as soon as the warning is not a problem.

Yes, I am also fine with re-introducing the annotation together with the
8250 series.

John

2023-08-14 11:32:44

by Jiri Slaby

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: drop lockdep annotation from serial8250_clear_IER()

On 14. 08. 23, 12:00, Petr Mladek wrote:
> I personally vote to keep it as is unless people see this warning
> on daily basis. After all, the lockdep splat is correct. The serial
> console might not work correctly in panic() when there is the race.

Sorry, but no, the warning is not correct at all. The code path
deliberately does NOT take the lock and calls a function which is
currently annotated that the lock is _always_ taken. Therefore, the
warning is clearly a false positive and I see no reason in keeping it.

Hopefully this is fixed as John described earlier. Until then, I see no
point bothering people with false alarms.

thanks,
--
js
suse labs


2023-08-14 11:43:03

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: drop lockdep annotation from serial8250_clear_IER()

On Mon 2023-08-14 10:21:14, John Ogness wrote:
> Hi Jiri,
>
> Thanks for the follow-up. You responded faster than I could correct
> myself.
>
> On 2023-08-14, Jiri Slaby <[email protected]> wrote:
> >>> The port lock is not always held when calling serial8250_clear_IER().
> >>> When an oops is in progress, the lock is tried to be taken and when it
> >>> is not, a warning is issued:
> >>
> >>> The other option would be to make the lockdep test conditional on
> >>> 'oops_in_progress'
>
> Actually I find this suggestion more appropriate. It makes it clear that
> we are willing to take such risks and do not want to see the warnings in
> a panic situation. However, I would end up having to revert that change
> as well, so it really does not matter to me at this point. Either way I
> will be reverting this patch.

A "solution" would be to move debug_locks_off() before bust_spinlocks(1)
in panic().

debug_locks_off() is currently called before console_flush_on_panic().
I guess that it is because it ignores the result of console_trylock().
But the particular console drivers ignore a trylock result on
the port_lock already after the earlier bust_spinlocks(1).

My concern is that it would hide any other potential races, for
example, in __crash_kexec() or panic notifiers. So, I think that
it might cause more harm then good. Especially because the race
is quite uncommon. It requires activity on the serial port from
two processes during panic().

I personally vote to keep it as is unless people see this warning
on daily basis. After all, the lockdep splat is correct. The serial
console might not work correctly in panic() when there is the race.

Best Regards,
Petr

2023-08-15 13:53:20

by Jiri Slaby

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: drop lockdep annotation from serial8250_clear_IER()

On 15. 08. 23, 11:27, Petr Mladek wrote:
> On Mon 2023-08-14 12:21:23, Jiri Slaby wrote:
>> On 14. 08. 23, 12:00, Petr Mladek wrote:
>>> I personally vote to keep it as is unless people see this warning
>>> on daily basis. After all, the lockdep splat is correct. The serial
>>> console might not work correctly in panic() when there is the race.
>>
>> Sorry, but no, the warning is not correct at all. The code path deliberately
>> does NOT take the lock and calls a function which is currently annotated
>> that the lock is _always_ taken. Therefore, the warning is clearly a false
>> positive and I see no reason in keeping it.
>
> There might be a misunderstanding. I only want to keep panic()
> implementation as it is for now. I mean to keep calling
> debug_locks_off() right before console_flush_on_panic().
> The lockdep should stay on before to report potential problems
> in non-printk code, like kexec, panic notifiers.
>
> But I am fine with disabling the particular lockdep_assert_held_once()
> during panic().
>
> It should stay during the normal system state to catch not
> yet discovered races. John is working hard on preventing any
> races which might blow up after introducing the printk kthreads.
>
> I mean something like:
>
> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> index ecfdc4534123..9533c1eedfb1 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -704,7 +704,8 @@ static void serial8250_set_sleep(struct uart_8250_port *p, int sleep)
> static void serial8250_clear_IER(struct uart_8250_port *up)
> {
> /* Port locked to synchronize UART_IER access against the console. */
> - lockdep_assert_held_once(&up->port.lock);
> + if (!oops_in_progress)
> + lockdep_assert_held_once(&up->port.lock);

Yes, this is one of my suggestions ;). (Which I thought are not worth
it, but I am not opposing either.)

thanks,
--
js
suse labs