Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1948417AbdDYOu2 (ORCPT ); Tue, 25 Apr 2017 10:50:28 -0400 Received: from mfb01-md.ns.itscom.net ([175.177.155.109]:51157 "EHLO mfb01-md.ns.itscom.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1948369AbdDYOuU (ORCPT ); Tue, 25 Apr 2017 10:50:20 -0400 X-Greylist: delayed 408 seconds by postgrey-1.27 at vger.kernel.org; Tue, 25 Apr 2017 10:50:19 EDT From: "J. R. Okajima" To: linux-kernel@vger.kernel.org Subject: v4.11-rc7, deadlock? workqueue pool->lock and serial8250 port_lock_key Date: Tue, 25 Apr 2017 23:43:29 +0900 Message-ID: <20675.1493131409@jrobl> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6612 Lines: 190 Hello, I've found this kernel log from v4.11.0-rc7, and all(?) workqueues seem stopped working. The system is using a serial console by the kernel command line root=3D/dev/sda8 ro console=3DttyS0,115200 console=3Dtty0 mem=3D1G Does it mean that show_workqueue_state() on a serial console (almost?) always cause a deadlock? Who I can ask to check this log? I am not sure whether this is related, but I added a workqueue for my development. There is no such message in the log about my workqueue, but I am afraid my development is related to any change around workqueue. J. R. Okajima =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D [ INFO: possible circular locking dependency detected ] 4.11.0-rc7aufsD+ #114 Not tainted ------------------------------------------------------- dockerd/777 is trying to acquire lock: (&port_lock_key){-.-.-.}, at: [] serial8250_console_wri= te+0x1ad/0x570 but task is already holding lock: (&(&pool->lock)->rlock){-.-.-.}, at: [] show_workqueue_= state+0x1e0/0xd90 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&(&pool->lock)->rlock){-.-.-.}: lock_acquire+0xd5/0x380 _raw_spin_lock+0x32/0x50 __queue_work+0x143/0x1090 queue_work_on+0x38/0x80 put_pwq+0x152/0x2a0 put_pwq_unlocked+0x22/0x30 destroy_workqueue+0x3bd/0x680 nfs4_state_shutdown+0x15/0x30 [nfsd] nfsd_shutdown_generic+0x26/0x50 [nfsd] nfsd_shutdown_net+0x135/0x2b0 [nfsd] nfsd_last_thread+0x21e/0x2d0 [nfsd] svc_shutdown_net+0x4c/0xa0 nfsd_destroy+0x1b3/0x2f0 [nfsd] nfsd+0x28f/0x480 [nfsd] kthread+0x1a0/0x2d0 ret_from_fork+0x2e/0x40 -> #1 (&pool->lock/1){-.-.-.}: lock_acquire+0xd5/0x380 _raw_spin_lock+0x32/0x50 __queue_work+0x143/0x1090 queue_work_on+0x38/0x80 tty_schedule_flip+0x5b/0xb0 tty_flip_buffer_push+0xe/0x10 serial8250_rx_chars+0x19c/0x580 serial8250_handle_irq.part.17+0xd5/0x190 serial8250_default_handle_irq+0x5c/0xa0 serial8250_interrupt+0x83/0x160 __handle_irq_event_percpu+0x73/0x8f0 handle_irq_event_percpu+0x25/0x90 handle_irq_event+0x7d/0x160 handle_edge_irq+0x139/0x9e0 handle_irq+0xa1/0x400 do_IRQ+0x82/0x190 ret_from_intr+0x0/0x19 cpuidle_enter_state+0x150/0x820 cpuidle_enter+0x17/0x20 do_idle+0x25a/0x480 cpu_startup_entry+0x70/0x90 rest_init+0x15e/0x170 start_kernel+0x4e1/0x4ee x86_64_start_reservations+0x2a/0x2c x86_64_start_kernel+0xe7/0xee verify_cpu+0x0/0xfc -> #0 (&port_lock_key){-.-.-.}: __lock_acquire+0x1fb7/0x2e00 lock_acquire+0xd5/0x380 _raw_spin_lock_irqsave+0x3d/0x60 serial8250_console_write+0x1ad/0x570 univ8250_console_write+0x36/0x60 console_unlock+0x59b/0xa00 vprintk_emit+0x241/0x430 vprintk_default+0x1f/0x30 vprintk_func+0x2c/0x80 printk+0x43/0x4b show_workqueue_state+0x2a1/0xd90 wq_watchdog_timer_fn+0x245/0x3e0 call_timer_fn+0xd4/0x760 run_timer_softirq+0x541/0xbb0 __do_softirq+0x117/0xcb0 irq_exit+0xf9/0x170 smp_apic_timer_interrupt+0x44/0x70 apic_timer_interrupt+0x90/0xa0 read_hpet+0x107/0x120 ktime_get_ts64+0x115/0x460 poll_select_copy_remaining+0x7e/0x240 SyS_select+0x117/0x160 entry_SYSCALL_64_fastpath+0x18/0xad other info that might help us debug this: Chain exists of: &port_lock_key --> &pool->lock/1 --> &(&pool->lock)->rlock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&pool->lock)->rlock); lock(&pool->lock/1); lock(&(&pool->lock)->rlock); lock(&port_lock_key); *** DEADLOCK *** 4 locks held by dockerd/777: #0: (/proj/aufs/aufs4-linux.git/kernel/workqueue.c:5320){+.-.-.}, at: [<= ffffffff811d278b>] call_timer_fn+0x3b/0x760 #1: (rcu_read_lock_sched){......}, at: [] show_workque= ue_state+0x5/0xd90 #2: (&(&pool->lock)->rlock){-.-.-.}, at: [] show_workq= ueue_state+0x1e0/0xd90 #3: (console_lock){+.+.+.}, at: [] vprintk_emit+0x234/= 0x430 stack backtrace: CPU: 0 PID: 777 Comm: dockerd Not tainted 4.11.0-rc7aufsD+ #114 Hardware name: Pegatron Pegatron/IPM41, BIOS 0001 02/05/2009 Call Trace: dump_stack+0x68/0x9d print_circular_bug+0x2a6/0x2b7 __lock_acquire+0x1fb7/0x2e00 lock_acquire+0xd5/0x380 ? lock_acquire+0xd5/0x380 ? serial8250_console_write+0x1ad/0x570 _raw_spin_lock_irqsave+0x3d/0x60 ? serial8250_console_write+0x1ad/0x570 serial8250_console_write+0x1ad/0x570 ? _raw_spin_unlock+0x69/0x80 univ8250_console_write+0x36/0x60 console_unlock+0x59b/0xa00 ? __down_trylock_console_sem+0x68/0x80 ? vprintk_emit+0x234/0x430 vprintk_emit+0x241/0x430 vprintk_default+0x1f/0x30 vprintk_func+0x2c/0x80 printk+0x43/0x4b show_workqueue_state+0x2a1/0xd90 ? check_preemption_disabled+0x40/0x160 wq_watchdog_timer_fn+0x245/0x3e0 ? destroy_workqueue+0x680/0x680 call_timer_fn+0xd4/0x760 ? destroy_workqueue+0x680/0x680 run_timer_softirq+0x541/0xbb0 __do_softirq+0x117/0xcb0 irq_exit+0xf9/0x170 smp_apic_timer_interrupt+0x44/0x70 apic_timer_interrupt+0x90/0xa0 RIP: 0010:read_hpet+0x107/0x120 RSP: 0018:ffff88001ed5bdf8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 RAX: 000000000052aa86 RBX: 0000000000000246 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff880019c34a00 RBP: ffff88001ed5be20 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000091838cd1 R13: 0000000001594bf8 R14: ffff88001ed5bf00 R15: ffff88001ed5be88 ? hpet_interrupt_handler+0x80/0x80 ktime_get_ts64+0x115/0x460 poll_select_copy_remaining+0x7e/0x240 ? poll_select_set_timeout+0x91/0xf0 SyS_select+0x117/0x160 entry_SYSCALL_64_fastpath+0x18/0xad RIP: 0033:0x472017 RSP: 002b:00007f9eb391ece8 EFLAGS: 00000246 ORIG_RAX: 0000000000000017 RAX: ffffffffffffffda RBX: 000000000000006e RCX: 0000000000472017 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000046 R08: 00007f9eb391ece8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00000000000000f3 R14: 0000000000000030 R15: 0000000000000003