2019-02-16 11:23:12

by Hongbo Yao

[permalink] [raw]
Subject: possible deadlock in console_unlock

hi, sergey:

As shown in that link, https://lkml.org/lkml/2018/6/6/397

On the linux kernel 5.0-rc6, Syzkaller also hit 'possible deadlock in console_unlock'
bug for several times in my environment.

This solution fixes things for me. Do you have a plan to submit patches to
solve this problem.

diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
__printk_safe_enter();
kmalloc(sizeof(struct tty_buffer) + 2 * size, GFP_ATOMIC);
__printk_safe_exit();



Best regards,
Hongbo



2019-02-16 11:27:54

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: possible deadlock in console_unlock

On (02/16/19 14:36), Yao HongBo wrote:
> hi, sergey:
>
> As shown in that link, https://lkml.org/lkml/2018/6/6/397
>
> On the linux kernel 5.0-rc6, Syzkaller also hit 'possible deadlock in console_unlock'
> bug for several times in my environment.
>
> This solution fixes things for me. Do you have a plan to submit patches to
> solve this problem.
>
> diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
> __printk_safe_enter();
> kmalloc(sizeof(struct tty_buffer) + 2 * size, GFP_ATOMIC);
> __printk_safe_exit();

I would probably try the following:

---

diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index ec145a59f199..058d004dcbaa 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -172,7 +172,8 @@ static struct tty_buffer *tty_buffer_alloc(struct tty_port *port, size_t size)
have queued and recycle that ? */
if (atomic_read(&port->buf.mem_used) > port->buf.mem_limit)
return NULL;
- p = kmalloc(sizeof(struct tty_buffer) + 2 * size, GFP_ATOMIC);
+ p = kmalloc(sizeof(struct tty_buffer) + 2 * size,
+ GFP_ATOMIC | __GFP_NOWARN);
if (p == NULL)
return NULL;


2019-02-16 11:29:53

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: possible deadlock in console_unlock

On (02/16/19 16:21), Sergey Senozhatsky wrote:
> On (02/16/19 14:36), Yao HongBo wrote:
> > hi, sergey:
> >
> > As shown in that link, https://lkml.org/lkml/2018/6/6/397
> >
> > On the linux kernel 5.0-rc6, Syzkaller also hit 'possible deadlock in console_unlock'
> > bug for several times in my environment.
> >
> > This solution fixes things for me. Do you have a plan to submit patches to
> > solve this problem.
> >
> > diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
> > __printk_safe_enter();
> > kmalloc(sizeof(struct tty_buffer) + 2 * size, GFP_ATOMIC);
> > __printk_safe_exit();
>
> I would probably try the following:

Yao HongBo, could you please post the lockdep splat?

GFP_NOWARN is probably the best option for now. Yes, it, maybe,
will not work for fault-injection cases; but printk_safe approach
is harder to push for, especially given that printk_safe maybe will
not even exist in the future.

-ss

2019-02-16 11:31:30

by Hongbo Yao

[permalink] [raw]
Subject: Re: possible deadlock in console_unlock


On 2/16/2019 3:46 PM, Sergey Senozhatsky wrote:
> On (02/16/19 16:21), Sergey Senozhatsky wrote:
>> On (02/16/19 14:36), Yao HongBo wrote:
>>> hi, sergey:
>>>
>>> As shown in that link, https://lkml.org/lkml/2018/6/6/397
>>>
>>> On the linux kernel 5.0-rc6, Syzkaller also hit 'possible deadlock in console_unlock'
>>> bug for several times in my environment.
>>>
>>> This solution fixes things for me. Do you have a plan to submit patches to
>>> solve this problem.
>>>
>>> diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
>>> __printk_safe_enter();
>>> kmalloc(sizeof(struct tty_buffer) + 2 * size, GFP_ATOMIC);
>>> __printk_safe_exit();
>>
>> I would probably try the following:
>
> Yao HongBo, could you please post the lockdep splat?
>
> GFP_NOWARN is probably the best option for now. Yes, it, maybe,
> will not work for fault-injection cases; but printk_safe approach
> is harder to push for, especially given that printk_safe maybe will
> not even exist in the future.

I have tried GFP_NOWARN, but the problem still exists.
Only print_safe contexts for tty locks can solve the problem.
My test scenario is falt-injection.

deadlock report is shown as below:

RBP: 00007f1cf76cbc70 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f1cf76cc6bc
R13: 00000000004c473d R14: 0000000000701f18 R15: 0000000000000005

======================================================
WARNING: possible circular locking dependency detected
4.19.18-514.55.6.9.x86_64+ #1 Not tainted
------------------------------------------------------
syz-executor0/23291 is trying to acquire lock:
00000000d73d87c0 (console_owner){-.-.}, at: log_next kernel/printk/printk.c:495 [inline]
00000000d73d87c0 (console_owner){-.-.}, at: console_unlock+0x36d/0xb30 kernel/printk/printk.c:2397

but task is already holding lock:
00000000dfbab914 (&(&port->lock)->rlock){-.-.}, at: pty_write+0xd2/0x1d0 drivers/tty/pty.c:119

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&(&port->lock)->rlock){-.-.}:
tty_port_tty_get+0x20/0x80 drivers/tty/tty_port.c:288
tty_port_default_wakeup+0x16/0x40 drivers/tty/tty_port.c:47
serial8250_tx_chars+0x4dc/0xa80 drivers/tty/serial/8250/8250_port.c:1806
serial8250_handle_irq.part.12+0x198/0x220 drivers/tty/serial/8250/8250_port.c:1879
serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1899 [inline]
serial8250_default_handle_irq+0xf8/0x120 drivers/tty/serial/8250/8250_port.c:1895
serial8250_interrupt+0xfe/0x250 drivers/tty/serial/8250/8250_core.c:125
__handle_irq_event_percpu+0xf5/0x730 kernel/irq/handle.c:149
handle_irq_event_percpu+0x7b/0x170 kernel/irq/handle.c:189
handle_irq_event+0xa6/0x140 kernel/irq/handle.c:206
handle_edge_irq+0x1eb/0xa90 kernel/irq/chip.c:791
generic_handle_irq_desc include/linux/irqdesc.h:154 [inline]
handle_irq+0x3e/0x50 arch/x86/kernel/irq_64.c:78
do_IRQ+0x92/0x200 arch/x86/kernel/irq.c:246
ret_from_intr+0x0/0x22
native_safe_halt+0x2/0x10 arch/x86/include/asm/irqflags.h:57
arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline]
default_idle+0x24/0x2b0 arch/x86/kernel/process.c:561
cpuidle_idle_call kernel/sched/idle.c:153 [inline]
do_idle+0x2ca/0x420 kernel/sched/idle.c:262
cpu_startup_entry+0xcb/0xe0 kernel/sched/idle.c:368
start_secondary+0x421/0x570 arch/x86/kernel/smpboot.c:271
secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243

-> #1 (&port_lock_key){-.-.}:
serial8250_console_write+0x68a/0x820 drivers/tty/serial/8250/8250_port.c:3247
call_console_drivers kernel/printk/printk.c:1729 [inline]
console_unlock+0x66a/0xb30 kernel/printk/printk.c:2410
vprintk_emit+0x181/0x570 kernel/printk/printk.c:1927
vprintk_default+0x68/0xe0 kernel/printk/printk.c:1968
vprintk_func+0x57/0xf0 kernel/printk/printk_safe.c:398
printk+0xb7/0xe2 kernel/printk/printk.c:2001
register_console+0x752/0xc60 kernel/printk/printk.c:2725
univ8250_console_init+0x31/0x3a drivers/tty/serial/8250/8250_core.c:685
console_init+0x3ad/0x567 kernel/printk/printk.c:2811
start_kernel+0x4c3/0x7e1 init/main.c:661
secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243

-> #0 (console_owner){-.-.}:
console_lock_spinning_enable kernel/printk/printk.c:1592 [inline]
console_unlock+0x3d9/0xb30 kernel/printk/printk.c:2407
vprintk_emit+0x181/0x570 kernel/printk/printk.c:1927
vprintk_default+0x68/0xe0 kernel/printk/printk.c:1968
vprintk_func+0x57/0xf0 kernel/printk/printk_safe.c:398
printk+0xb7/0xe2 kernel/printk/printk.c:2001
fail_dump lib/fault-inject.c:44 [inline]
should_fail+0x5d3/0x700 lib/fault-inject.c:149
__should_failslab+0x110/0x180 mm/failslab.c:32
should_failslab+0xa/0x20 mm/slab_common.c:1557
slab_pre_alloc_hook mm/slab.h:423 [inline]
slab_alloc_node mm/slub.c:2632 [inline]
slab_alloc mm/slub.c:2714 [inline]
__kmalloc+0x6e/0x350 mm/slub.c:3747
kmalloc include/linux/slab.h:518 [inline]
tty_buffer_alloc drivers/tty/tty_buffer.c:170 [inline]
__tty_buffer_request_room+0x1cf/0x5e0 drivers/tty/tty_buffer.c:268
tty_insert_flip_string_fixed_flag+0x8f/0x220 drivers/tty/tty_buffer.c:313
tty_insert_flip_string include/linux/tty_flip.h:37 [inline]
pty_write+0x104/0x1d0 drivers/tty/pty.c:121
n_tty_write+0x9a3/0xd90 drivers/tty/n_tty.c:2354
do_tty_write drivers/tty/tty_io.c:958 [inline]
tty_write+0x451/0x8a0 drivers/tty/tty_io.c:1042
__vfs_write+0xef/0x6a0 fs/read_write.c:485
vfs_write+0x184/0x4c0 fs/read_write.c:549
ksys_write+0xc6/0x1a0 fs/read_write.c:598
do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

Chain exists of:
console_owner --> &port_lock_key --> &(&port->lock)->rlock

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(&(&port->lock)->rlock);
lock(&port_lock_key);
lock(&(&port->lock)->rlock);
lock(console_owner);

*** DEADLOCK ***


> -ss
>
>


2019-02-18 06:07:12

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: possible deadlock in console_unlock

Hi,

On (02/16/19 15:59), Yao HongBo wrote:
> > GFP_NOWARN is probably the best option for now. Yes, it, maybe,
> > will not work for fault-injection cases; but printk_safe approach
> > is harder to push for, especially given that printk_safe maybe will
> > not even exist in the future.
>
> I have tried GFP_NOWARN, but the problem still exists.
> Only print_safe contexts for tty locks can solve the problem.
> My test scenario is falt-injection.

Oh, I see. Yes, fault-injection is special.

I suspect that this patch series can be helpful then
https://lore.kernel.org/lkml/[email protected]/T/#u

but first we need to figure out if printk_safe will
stay in the kernel (this will take some time).

-ss

2019-02-18 12:33:12

by Hongbo Yao

[permalink] [raw]
Subject: Re: possible deadlock in console_unlock



On 2/18/2019 1:46 PM, Sergey Senozhatsky wrote:
> Hi,
>
> On (02/16/19 15:59), Yao HongBo wrote:
>>> GFP_NOWARN is probably the best option for now. Yes, it, maybe,
>>> will not work for fault-injection cases; but printk_safe approach
>>> is harder to push for, especially given that printk_safe maybe will
>>> not even exist in the future.
>>
>> I have tried GFP_NOWARN, but the problem still exists.
>> Only print_safe contexts for tty locks can solve the problem.
>> My test scenario is falt-injection.
>
> Oh, I see. Yes, fault-injection is special.
>
> I suspect that this patch series can be helpful then
> https://lore.kernel.org/lkml/[email protected]/T/#u

ok, i'll try it.
Thanks.

> but first we need to figure out if printk_safe will
> stay in the kernel (this will take some time).
>
> -ss
>
>


2019-02-18 14:20:04

by Hongbo Yao

[permalink] [raw]
Subject: Re: possible deadlock in console_unlock



On 2/18/2019 1:46 PM, Sergey Senozhatsky wrote:
> Hi,
>
> On (02/16/19 15:59), Yao HongBo wrote:
>>> GFP_NOWARN is probably the best option for now. Yes, it, maybe,
>>> will not work for fault-injection cases; but printk_safe approach
>>> is harder to push for, especially given that printk_safe maybe will
>>> not even exist in the future.
>>
>> I have tried GFP_NOWARN, but the problem still exists.
>> Only print_safe contexts for tty locks can solve the problem.
>> My test scenario is falt-injection.
>
> Oh, I see. Yes, fault-injection is special.
>
> I suspect that this patch series can be helpful then
> https://lore.kernel.org/lkml/[email protected]/T/#u

hi, sergey.
I merged this patch series on linux-4.19.18, but it didn't work for the fault-injection cases.
The failure seems to be the same as before.

deadlock log:
-------
[ 193.213385] FAULT_INJECTION: forcing a failure.
[ 193.213385] name failslab, interval 1, probability 0, space 0, times 1
[ 193.216518] CPU: 1 PID: 6317 Comm: syz-executor0 Not tainted 4.19.18-514.55.6.9.x86_64+ #5
[ 193.217383] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 193.217383] Call Trace:
[ 193.217383] dump_stack+0xca/0x13e
[ 193.217383] should_fail+0x607/0x700
[ 193.217383] ? fault_create_debugfs_attr+0x1d0/0x1d0
[ 193.217383] ? __lock_is_held+0xbc/0x160
[ 193.225711] __should_failslab+0x110/0x180
[ 193.225711] should_failslab+0xa/0x20
[ 193.225711] __kmalloc+0x6e/0x350
[ 193.225711] ? tty_write+0x65e/0x8a0
[ 193.225711] tty_write+0x65e/0x8a0
[ 193.225711] ? process_echoes+0x140/0x140
[ 193.225711] ? rw_verify_area+0xe2/0x2b0
[ 193.225711] do_iter_write+0x3dc/0x580
[ 193.233550] vfs_writev+0x16c/0x300
[ 193.233550] ? vfs_iter_write+0xa0/0xa0
[ 193.233550] ? lock_downgrade+0x5e0/0x5e0
[ 193.233550] ? __lock_is_held+0xbc/0x160
[ 193.233550] ? __fget+0x336/0x540
[ 193.238589] ? do_dup2+0x3d0/0x3d0
[ 193.239291] ? __mutex_unlock_slowpath+0xe1/0x690
[ 193.239291] ? do_writev+0xd1/0x230
[ 193.239291] do_writev+0xd1/0x230
[ 193.239291] ? vfs_writev+0x300/0x300
[ 193.239291] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 193.239291] ? trace_hardirqs_off_caller+0x40/0x190
[ 193.239291] ? do_syscall_64+0x3b/0x580
[ 193.239291] do_syscall_64+0xc8/0x580
[ 193.239291] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 193.239291] RIP: 0033:0x462589
[ 193.239291] Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
[ 193.239291] RSP: 002b:00007f2a92dbcc58 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
[ 193.239291] RAX: ffffffffffffffda RBX: 000000000072bf00 RCX: 0000000000462589
[ 193.239291] RDX: 0000000000000001 RSI: 0000000020000040 RDI: 0000000000000003
[ 193.239291] RBP: 00007f2a92dbcc70 R08: 0000000000000000 R09: 0000000000000000
[ 193.239291] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2a92dbd6bc
[ 193.239291] R13: 00000000004c222e R14: 0000000000702110 R15: 0000000000000004
[ 193.266429] FAULT_INJECTION: forcing a failure.
[ 193.266429] name failslab, interval 1, probability 0, space 0, times 0
[ 193.266767] CPU: 0 PID: 6340 Comm: syz-executor13 Not tainted 4.19.18-514.55.6.9.x86_64+ #5
[ 193.266767] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 193.266767] Call Trace:
[ 193.266767] dump_stack+0xca/0x13e
[ 193.266767] should_fail+0x607/0x700
[ 193.266767] ? n_tty_write+0x96a/0xd90
[ 193.266767] ? fault_create_debugfs_attr+0x1d0/0x1d0
[ 193.266767] ? mark_held_locks+0x120/0x120
[ 193.266767] __should_failslab+0x110/0x180
[ 193.266767] should_failslab+0xa/0x20
[ 193.266767] __kmalloc+0x6e/0x350
[ 193.266767] ? __tty_buffer_request_room+0x1cf/0x5e0
[ 193.266767] __tty_buffer_request_room+0x1cf/0x5e0
[ 193.266767] tty_insert_flip_string_fixed_flag+0x8f/0x220
[ 193.266767] pty_write+0x104/0x1d0
[ 193.266767] n_tty_write+0x9a3/0xd90
[ 193.266767] ? process_echoes+0x140/0x140
[ 193.266767] ? do_wait_intr_irq+0x2a0/0x2a0
[ 193.266767] ? __might_fault+0x17c/0x1c0
[ 193.266767] tty_write+0x451/0x8a0
[ 193.266767] ? process_echoes+0x140/0x140
[ 193.266767] do_iter_write+0x3dc/0x580
[ 193.266767] vfs_writev+0x16c/0x300
[ 193.266767] ? vfs_iter_write+0xa0/0xa0
[ 193.266767] ? lock_downgrade+0x5e0/0x5e0
[ 193.266767] ? __lock_is_held+0xbc/0x160
[ 193.266767] ? __fget+0x336/0x540
[ 193.266767] ? do_dup2+0x3d0/0x3d0
[ 193.266767] ? __mutex_unlock_slowpath+0xe1/0x690
[ 193.266767] ? do_writev+0xd1/0x230
[ 193.266767] do_writev+0xd1/0x230
[ 193.266767] ? vfs_writev+0x300/0x300
[ 193.266767] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 193.266767] ? trace_hardirqs_off_caller+0x40/0x190
[ 193.266767] ? do_syscall_64+0x3b/0x580
[ 193.266767] do_syscall_64+0xc8/0x580
[ 193.266767] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 193.266767] RIP: 0033:0x462589
[ 193.266767] Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
[ 193.266767] RSP: 002b:00007f008dcc9c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
[ 193.266767] RAX: ffffffffffffffda RBX: 000000000072bf00 RCX: 0000000000462589
[ 193.266767] RDX: 0000000000000001 RSI: 0000000020000040 RDI: 0000000000000003
[ 193.266767] RBP: 00007f008dcc9c70 R08: 0000000000000000 R09: 0000000000000000
[ 193.266767] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f008dcca6bc
[ 193.266767] R13: 00000000004c222e R14: 0000000000702110 R15: 0000000000000004
[ 193.266767]
[ 193.266767] ======================================================
[ 193.266767] WARNING: possible circular locking dependency detected
[ 193.266767] 4.19.18-514.55.6.9.x86_64+ #5 Not tainted
[ 193.266767] ------------------------------------------------------
[ 193.266767] syz-executor13/6340 is trying to acquire lock:
[ 193.266767] 0000000026c0d2ca (console_owner){-.-.}, at: console_unlock+0x36d/0xb30
[ 193.266767]
[ 193.266767] but task is already holding lock:
[ 193.266767] 00000000f655eadd (&(&port->lock)->rlock){-.-.}, at: pty_write+0xd2/0x1d0
[ 193.266767]
[ 193.266767] which lock already depends on the new lock.
[ 193.266767]
[ 193.266767]
[ 193.266767] the existing dependency chain (in reverse order) is:
[ 193.266767]
[ 193.266767] -> #2 (&(&port->lock)->rlock){-.-.}:
[ 193.266767] tty_port_tty_get+0x20/0x80
[ 193.266767] tty_port_default_wakeup+0x16/0x40
[ 193.266767] serial8250_tx_chars+0x4dc/0xa80
[ 193.266767] serial8250_handle_irq.part.12+0x16b/0x270
[ 193.266767] serial8250_default_handle_irq+0xf8/0x120
[ 193.266767] serial8250_interrupt+0xfe/0x250
[ 193.266767] __handle_irq_event_percpu+0xf5/0x730
[ 193.266767] handle_irq_event_percpu+0x7b/0x170
[ 193.266767] handle_irq_event+0xa6/0x140
[ 193.266767] handle_edge_irq+0x1eb/0xa90
[ 193.266767] handle_irq+0x3e/0x50
[ 193.266767] do_IRQ+0x92/0x200
[ 193.266767] ret_from_intr+0x0/0x22
[ 193.266767] native_safe_halt+0x2/0x10
[ 193.266767] default_idle+0x24/0x2b0
[ 193.266767] do_idle+0x2ca/0x420
[ 193.266767] cpu_startup_entry+0xcb/0xe0
[ 193.266767] start_secondary+0x421/0x570
[ 193.266767] secondary_startup_64+0xa4/0xb0
[ 193.266767]
[ 193.266767] -> #1 (&port_lock_key){-.-.}:
[ 193.266767] serial8250_console_write+0x663/0x810
[ 193.266767] console_unlock+0x66a/0xb30
[ 193.266767] vprintk_emit+0x181/0x570
[ 193.266767] vprintk_default+0x68/0xe0
[ 193.266767] vprintk_func+0x57/0xf0
[ 193.266767] printk+0xb7/0xe2
[ 193.266767] register_console+0x752/0xc60
[ 193.266767] univ8250_console_init+0x31/0x3a
[ 193.266767] console_init+0x3ad/0x567
[ 193.266767] start_kernel+0x4c3/0x7e1
[ 193.266767] secondary_startup_64+0xa4/0xb0
[ 193.266767]
[ 193.266767] -> #0 (console_owner){-.-.}:
[ 193.266767] console_unlock+0x3d9/0xb30
[ 193.266767] vprintk_emit+0x181/0x570
[ 193.266767] vprintk_default+0x68/0xe0
[ 193.266767] vprintk_func+0x57/0xf0
[ 193.266767] printk+0xb7/0xe2
[ 193.266767] should_fail+0x5d3/0x700
[ 193.266767] __should_failslab+0x110/0x180
[ 193.266767] should_failslab+0xa/0x20
[ 193.266767] __kmalloc+0x6e/0x350
[ 193.266767] __tty_buffer_request_room+0x1cf/0x5e0
[ 193.266767] tty_insert_flip_string_fixed_flag+0x8f/0x220
[ 193.266767] pty_write+0x104/0x1d0
[ 193.266767] n_tty_write+0x9a3/0xd90
[ 193.266767] tty_write+0x451/0x8a0
[ 193.266767] do_iter_write+0x3dc/0x580
[ 193.266767] vfs_writev+0x16c/0x300
[ 193.266767] do_writev+0xd1/0x230
[ 193.266767] do_syscall_64+0xc8/0x580
[ 193.266767] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 193.266767]
[ 193.266767] other info that might help us debug this:
[ 193.266767]
[ 193.266767] Chain exists of:
[ 193.266767] console_owner --> &port_lock_key --> &(&port->lock)->rlock
[ 193.266767]
[ 193.266767] Possible unsafe locking scenario:
[ 193.266767]
[ 193.266767] CPU0 CPU1
[ 193.266767] ---- ----
[ 193.266767] lock(&(&port->lock)->rlock);
[ 193.266767] lock(&port_lock_key);
[ 193.266767] lock(&(&port->lock)->rlock);
[ 193.266767] lock(console_owner);
[ 193.266767]
[ 193.266767] *** DEADLOCK ***




> but first we need to figure out if printk_safe will
> stay in the kernel (this will take some time).
>
> -ss
>
>


2019-02-19 02:49:26

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: possible deadlock in console_unlock

On (02/18/19 22:07), Yao HongBo wrote:
> >> I have tried GFP_NOWARN, but the problem still exists.
> >> Only print_safe contexts for tty locks can solve the problem.
> >> My test scenario is falt-injection.
> >
> > Oh, I see. Yes, fault-injection is special.
> >
> > I suspect that this patch series can be helpful then
> > https://lore.kernel.org/lkml/[email protected]/T/#u
>
> hi, sergey.

Hello,

> I merged this patch series on linux-4.19.18, but it didn't work for the fault-injection cases.

Thanks!

> The failure seems to be the same as before.

OK... So tty_port lock must switch to printk_safe, after all...
I had it in one of the previous versions of the patchset which you
have tested, but people were strictly against new locking rules
in TTY, so I dropped that part. Need to think what we can do here.

BTW,
we are now looking at a completely new printk implementation; which
would not use printk_safe at all:
https://lore.kernel.org/lkml/[email protected]/T/#u

-ss

2019-02-19 02:53:14

by Hongbo Yao

[permalink] [raw]
Subject: Re: possible deadlock in console_unlock



On 2/19/2019 9:32 AM, Sergey Senozhatsky wrote:
> On (02/18/19 22:07), Yao HongBo wrote:
>>>> I have tried GFP_NOWARN, but the problem still exists.
>>>> Only print_safe contexts for tty locks can solve the problem.
>>>> My test scenario is falt-injection.
>>>
>>> Oh, I see. Yes, fault-injection is special.
>>>
>>> I suspect that this patch series can be helpful then
>>> https://lore.kernel.org/lkml/[email protected]/T/#u
>>
>> hi, sergey.
>
> Hello,
>
>> I merged this patch series on linux-4.19.18, but it didn't work for the fault-injection cases.
>
> Thanks!
>
>> The failure seems to be the same as before.
>
> OK... So tty_port lock must switch to printk_safe, after all...
> I had it in one of the previous versions of the patchset which you
> have tested, but people were strictly against new locking rules
> in TTY, so I dropped that part. Need to think what we can do here.
>
> BTW,
> we are now looking at a completely new printk implementation; which
> would not use printk_safe at all:
> https://lore.kernel.org/lkml/[email protected]/T/#u

Ok, i understand it.
Anyway, thank you for your help.
Best regards,
Hongbo.

> -ss
>
>