2018-05-04 16:31:22

by Daniel Wagner

[permalink] [raw]
Subject: [PATCH v2] serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version

From: Daniel Wagner <[email protected]>

Commit 40f70c03e33a ("serial: sh-sci: add locking to console write
function to avoid SMP lockup") copied the strategy to avoid locking
problems in conjuncture with the console from the UART8250
driver. Instead using directly spin_{try}lock_irqsave(),
local_irq_save() followed by spin_{try}lock() was used. While this is
correct on mainline, for -rt it is a problem. spin_{try}lock() will
check if it is running in a valid context. Since the local_irq_save()
has already been executed, the context has changed and
spin_{try}lock() will complain. The reason why spin_{try}lock()
complains is that on -rt the spin locks are turned into mutexes and
therefore can sleep. Sleeping with interrupts disabled is not valid.

BUG: sleeping function called from invalid context at /home/wagi/work/rt/v4.4-cip-rt/kernel/locking/rtmutex.c:995
in_atomic(): 0, irqs_disabled(): 128, pid: 778, name: irq/76-eth0
CPU: 0 PID: 778 Comm: irq/76-eth0 Not tainted 4.4.126-test-cip22-rt14-00403-gcd03665c8318 #12
Hardware name: Generic RZ/G1 (Flattened Device Tree)
Backtrace:
[<c00140a0>] (dump_backtrace) from [<c001424c>] (show_stack+0x18/0x1c)
r7:c06b01f0 r6:60010193 r5:00000000 r4:c06b01f0
[<c0014234>] (show_stack) from [<c01d3c94>] (dump_stack+0x78/0x94)
[<c01d3c1c>] (dump_stack) from [<c004c134>] (___might_sleep+0x134/0x194)
r7:60010113 r6:c06d3559 r5:00000000 r4:ffffe000
[<c004c000>] (___might_sleep) from [<c04ded60>] (rt_spin_lock+0x20/0x74)
r5:c06f4d60 r4:c06f4d60
[<c04ded40>] (rt_spin_lock) from [<c02577e4>] (serial_console_write+0x100/0x118)
r5:c06f4d60 r4:c06f4d60
[<c02576e4>] (serial_console_write) from [<c0061060>] (call_console_drivers.constprop.15+0x10c/0x124)
r10:c06d2894 r9:c04e18b0 r8:00000028 r7:00000000 r6:c06d3559 r5:c06d2798
r4:c06b9914 r3:c02576e4
[<c0060f54>] (call_console_drivers.constprop.15) from [<c0062984>] (console_unlock+0x32c/0x430)
r10:c06d30d8 r9:00000028 r8:c06dd518 r7:00000005 r6:00000000 r5:c06d2798
r4:c06d2798 r3:00000028
[<c0062658>] (console_unlock) from [<c0062e1c>] (vprintk_emit+0x394/0x4f0)
r10:c06d2798 r9:c06d30ee r8:00000006 r7:00000005 r6:c06a78fc r5:00000027
r4:00000003
[<c0062a88>] (vprintk_emit) from [<c0062fa0>] (vprintk+0x28/0x30)
r10:c060bd46 r9:00001000 r8:c06b9a90 r7:c06b9a90 r6:c06b994c r5:c06b9a3c
r4:c0062fa8
[<c0062f78>] (vprintk) from [<c0062fb8>] (vprintk_default+0x10/0x14)
[<c0062fa8>] (vprintk_default) from [<c009cd30>] (printk+0x78/0x84)
[<c009ccbc>] (printk) from [<c025afdc>] (credit_entropy_bits+0x17c/0x2cc)
r3:00000001 r2:decade60 r1:c061a5ee r0:c061a523
r4:00000006
[<c025ae60>] (credit_entropy_bits) from [<c025bf74>] (add_interrupt_randomness+0x160/0x178)
r10:466e7196 r9:1f536000 r8:fffeef74 r7:00000000 r6:c06b9a60 r5:c06b9a3c
r4:dfbcf680
[<c025be14>] (add_interrupt_randomness) from [<c006536c>] (irq_thread+0x1e8/0x248)
r10:c006537c r9:c06cdf21 r8:c0064fcc r7:df791c24 r6:df791c00 r5:ffffe000
r4:df525180
[<c0065184>] (irq_thread) from [<c003fba4>] (kthread+0x108/0x11c)
r10:00000000 r9:00000000 r8:c0065184 r7:df791c00 r6:00000000 r5:df791d00
r4:decac000
[<c003fa9c>] (kthread) from [<c00101b8>] (ret_from_fork+0x14/0x3c)
r8:00000000 r7:00000000 r6:00000000 r5:c003fa9c r4:df791d00

Cc: Geert Uytterhoeven <[email protected]>
Signed-off-by: Daniel Wagner <[email protected]>
---

changes since v1:
- Ported to current mainline (initial version was against v4.4.y)
- Left local_irq_save() in place when spinlocks are not used as suggested
by Geert.

drivers/tty/serial/sh-sci.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c
index fdbbff547106..ec7417be9e08 100644
--- a/drivers/tty/serial/sh-sci.c
+++ b/drivers/tty/serial/sh-sci.c
@@ -2890,16 +2890,16 @@ static void serial_console_write(struct console *co, const char *s,
unsigned long flags;
int locked = 1;

- local_irq_save(flags);
#if defined(SUPPORT_SYSRQ)
- if (port->sysrq)
+ if (port->sysrq) {
locked = 0;
- else
+ local_irq_save(flags);
+ } else
#endif
if (oops_in_progress)
- locked = spin_trylock(&port->lock);
+ locked = spin_trylock_irqsave(&port->lock, flags);
else
- spin_lock(&port->lock);
+ spin_lock_irqsave(&port->lock, flags);

/* first save SCSCR then disable interrupts, keep clock source */
ctrl = serial_port_in(port, SCSCR);
@@ -2919,8 +2919,9 @@ static void serial_console_write(struct console *co, const char *s,
serial_port_out(port, SCSCR, ctrl);

if (locked)
- spin_unlock(&port->lock);
- local_irq_restore(flags);
+ spin_unlock_irqrestore(&port->lock, flags);
+ else
+ local_irq_restore(flags);
}

static int serial_console_setup(struct console *co, char *options)
--
2.14.3


2018-05-07 07:15:10

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH v2] serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version

Hi Daniel,

On Fri, May 4, 2018 at 6:30 PM, Daniel Wagner <[email protected]> wrote:
> From: Daniel Wagner <[email protected]>
>
> Commit 40f70c03e33a ("serial: sh-sci: add locking to console write
> function to avoid SMP lockup") copied the strategy to avoid locking
> problems in conjuncture with the console from the UART8250
> driver. Instead using directly spin_{try}lock_irqsave(),
> local_irq_save() followed by spin_{try}lock() was used. While this is
> correct on mainline, for -rt it is a problem. spin_{try}lock() will
> check if it is running in a valid context. Since the local_irq_save()
> has already been executed, the context has changed and
> spin_{try}lock() will complain. The reason why spin_{try}lock()
> complains is that on -rt the spin locks are turned into mutexes and
> therefore can sleep. Sleeping with interrupts disabled is not valid.

[...]

> ---
>
> changes since v1:
> - Ported to current mainline (initial version was against v4.4.y)
> - Left local_irq_save() in place when spinlocks are not used as suggested
> by Geert.

Thanks for the update!

> --- a/drivers/tty/serial/sh-sci.c
> +++ b/drivers/tty/serial/sh-sci.c
> @@ -2890,16 +2890,16 @@ static void serial_console_write(struct console *co, const char *s,
> unsigned long flags;
> int locked = 1;
>
> - local_irq_save(flags);
> #if defined(SUPPORT_SYSRQ)
> - if (port->sysrq)
> + if (port->sysrq) {
> locked = 0;
> - else
> + local_irq_save(flags);
> + } else
> #endif
> if (oops_in_progress)
> - locked = spin_trylock(&port->lock);
> + locked = spin_trylock_irqsave(&port->lock, flags);

If the spinlock could not be taken, interrupts are re-enabled:

include/linux/spinlock.h:
#define raw_spin_trylock_irqsave(lock, flags) \
({ \
local_irq_save(flags); \
raw_spin_trylock(lock) ? \
1 : ({ local_irq_restore(flags); 0; }); \
})

hence I think you need to check for this and disable interrupts again:

if (!locked)
local_irq_save(flags);

> else
> - spin_lock(&port->lock);
> + spin_lock_irqsave(&port->lock, flags);
>
> /* first save SCSCR then disable interrupts, keep clock source */
> ctrl = serial_port_in(port, SCSCR);
> @@ -2919,8 +2919,9 @@ static void serial_console_write(struct console *co, const char *s,
> serial_port_out(port, SCSCR, ctrl);
>
> if (locked)
> - spin_unlock(&port->lock);
> - local_irq_restore(flags);
> + spin_unlock_irqrestore(&port->lock, flags);
> + else
> + local_irq_restore(flags);
> }
>
> static int serial_console_setup(struct console *co, char *options)

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Subject: Re: [PATCH v2] serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version

On 2018-05-04 18:30:41 [+0200], Daniel Wagner wrote:
> --- a/drivers/tty/serial/sh-sci.c
> +++ b/drivers/tty/serial/sh-sci.c
> @@ -2890,16 +2890,16 @@ static void serial_console_write(struct console *co, const char *s,
> unsigned long flags;
> int locked = 1;
>
> - local_irq_save(flags);
> #if defined(SUPPORT_SYSRQ)
> - if (port->sysrq)
> + if (port->sysrq) {
> locked = 0;
> - else
> + local_irq_save(flags);

how is this helping? You should see a splat after a sysrq request.

> + } else
> #endif
> if (oops_in_progress)
> - locked = spin_trylock(&port->lock);
> + locked = spin_trylock_irqsave(&port->lock, flags);
> else
> - spin_lock(&port->lock);
> + spin_lock_irqsave(&port->lock, flags);
>
> /* first save SCSCR then disable interrupts, keep clock source */
> ctrl = serial_port_in(port, SCSCR);

Sebastian

2018-05-08 07:21:01

by Daniel Wagner

[permalink] [raw]
Subject: Re: [PATCH v2] serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version

Hi Sebastian,

On 05/07/2018 02:51 PM, Sebastian Andrzej Siewior wrote:
> On 2018-05-04 18:30:41 [+0200], Daniel Wagner wrote:
>> --- a/drivers/tty/serial/sh-sci.c
>> +++ b/drivers/tty/serial/sh-sci.c
>> @@ -2890,16 +2890,16 @@ static void serial_console_write(struct console *co, const char *s,
>> unsigned long flags;
>> int locked = 1;
>>
>> - local_irq_save(flags);
>> #if defined(SUPPORT_SYSRQ)
>> - if (port->sysrq)
>> + if (port->sysrq) {
>> locked = 0;
>> - else
>> + local_irq_save(flags);
>
> how is this helping? You should see a splat after a sysrq request.

You are right, I didn't really think this through.

Should 'echo t > /proc/sysrq' trigger the splat? At least I was so naive
that think it would be enough.

Thanks,
Daniel

Subject: Re: [PATCH v2] serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version

On 2018-05-08 09:18:44 [+0200], Daniel Wagner wrote:
> Hi Sebastian,
Hi,

> Should 'echo t > /proc/sysrq' trigger the splat? At least I was so naive
> that think it would be enough.

No, this is a different interface. The thing is you send the BREAK
command and then, the next character (within a timeout) that comes over
the UART is the MAGIC request. While that character is being received
the sysrq request is answered and output is written to the console which
is probably the UART and so you can't acquire the lock again.

So you can't acquire the lock. I added a try_lock once to at least try
to acquire the lock because this may race against a printk() from
another CPU. But then someone complained about a failed try_lock on UP
(this can't happen on UP unless in a scenario like this where the lock
is already acquired) so this change to the 8250 was reverted.

So the whole situation of the console interface is not perfect and this
is the duct tape we have. It would good to have the same duct tape in
all drivers or come up with a different interface to cover corner cases
:)

To reproduce the sysrq code path: You need to the UART as a console and
send a BREAK (CTRL-A F in minicom) followed by `t' to match your
example.

> Thanks,
> Daniel

Sebastian