Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S936907AbdCXQ6C (ORCPT ); Fri, 24 Mar 2017 12:58:02 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:46100 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756692AbdCXQ5y (ORCPT ); Fri, 24 Mar 2017 12:57:54 -0400 Date: Fri, 24 Mar 2017 17:57:45 +0100 (CET) From: Sebastian Ott X-X-Sender: sebott@schleppi.fritz.box To: Steven Rostedt cc: Sergey Senozhatsky , Petr Mladek , Heiko Carstens , linux-kernel@vger.kernel.org Subject: Re: lockdep warning: console vs. mem hotplug In-Reply-To: <20170324123933.78f9c125@gandalf.local.home> References: <20170324123933.78f9c125@gandalf.local.home> User-Agent: Alpine 2.20 (LFD 67 2015-01-07) Organization: =?ISO-8859-15?Q?=22IBM_Deutschland_Research_&_Development_GmbH_=2F_Vorsitzende_des_Aufsichtsrats=3A_Martina_Koederitz_Gesch=E4ftsf=FChrung=3A_Dirk_Wittkopp_Sitz_der_Gesellschaft=3A_B=F6blingen_=2F_Registergericht?= =?ISO-8859-15?Q?=3A_Amtsgericht_Stuttgart=2C_HRB_243294=22?= MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-TM-AS-GCONF: 00 x-cbid: 17032416-0020-0000-0000-0000032E3754 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17032416-0021-0000-0000-000040F33915 Message-Id: X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-03-24_15:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=3 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1702020001 definitions=main-1703240143 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 16910 Lines: 312 On Fri, 24 Mar 2017, Steven Rostedt wrote: > Is there a stack trace of where the lockdep dump happened? That is > useful too. Otherwise we don't see where the inverse happened. Sure. Here you go. [ 132.379220] ====================================================== [ 132.379220] [ INFO: possible circular locking dependency detected ] [ 132.379221] 4.11.0-rc3 #330 Not tainted [ 132.379221] ------------------------------------------------------- [ 132.379222] sh/708 is trying to acquire lock: [ 132.379222] (&(&sclp_con_lock)->rlock){-.-...}, at: [<000000000074524a>] sclp_console_write+0x4a/0x2b8 [ 132.379224] [ 132.379224] but task is already holding lock: [ 132.379224] (&(&zone->lock)->rlock){..-...}, at: [<00000000002a65a8>] __offline_isolated_pages+0xe0/0x340 [ 132.379226] [ 132.379227] which lock already depends on the new lock. [ 132.379227] [ 132.379227] [ 132.379228] the existing dependency chain (in reverse order) is: [ 132.379228] [ 132.379229] -> #2 (&(&zone->lock)->rlock){..-...}: [ 132.379230] validate_chain.isra.22+0xb36/0xd18 [ 132.379231] __lock_acquire+0x650/0x880 [ 132.379231] lock_acquire+0x24e/0x2b0 [ 132.379232] _raw_spin_lock_irqsave+0x70/0xb8 [ 132.379232] get_page_from_freelist+0x548/0xf30 [ 132.379233] __alloc_pages_nodemask+0x1c8/0x14c0 [ 132.379233] allocate_slab+0xde/0x668 [ 132.379233] new_slab+0x94/0xa8 [ 132.379234] ___slab_alloc.constprop.66+0x556/0x578 [ 132.379234] __slab_alloc.isra.62.constprop.65+0x74/0xa8 [ 132.379235] kmem_cache_alloc+0x136/0x4b8 [ 132.379235] __debug_object_init+0xaa/0x448 [ 132.379236] debug_object_activate+0x182/0x238 [ 132.379236] mod_timer+0xe6/0x450 [ 132.379236] prandom_reseed+0x44/0x50 [ 132.379237] do_one_initcall+0xa2/0x150 [ 132.379237] kernel_init_freeable+0x228/0x2d8 [ 132.379238] kernel_init+0x2a/0x138 [ 132.379238] kernel_thread_starter+0x6/0xc [ 132.379239] kernel_thread_starter+0x0/0xc [ 132.379239] [ 132.379239] -> #1 (&(&base->lock)->rlock){-.-.-.}: [ 132.379241] validate_chain.isra.22+0xb36/0xd18 [ 132.379242] __lock_acquire+0x650/0x880 [ 132.379242] lock_acquire+0x24e/0x2b0 [ 132.379242] _raw_spin_lock_irqsave+0x70/0xb8 [ 132.379243] lock_timer_base+0x78/0xa8 [ 132.379243] mod_timer+0xba/0x450 [ 132.379244] sclp_console_write+0x262/0x2b8 [ 132.379244] console_unlock+0x4a6/0x658 [ 132.379245] register_console+0x31c/0x430 [ 132.379245] sclp_console_init+0x152/0x170 [ 132.379245] console_init+0x3a/0x58 [ 132.379246] start_kernel+0x350/0x480 [ 132.379246] _stext+0x20/0x80 [ 132.379246] [ 132.379247] -> #0 (&(&sclp_con_lock)->rlock){-.-...}: [ 132.379249] check_prev_add+0x150/0x648 [ 132.379249] validate_chain.isra.22+0xb36/0xd18 [ 132.379249] __lock_acquire+0x650/0x880 [ 132.379250] lock_acquire+0x24e/0x2b0 [ 132.379250] _raw_spin_lock_irqsave+0x70/0xb8 [ 132.379251] sclp_console_write+0x4a/0x2b8 [ 132.379251] console_unlock+0x4a6/0x658 [ 132.379252] vprintk_emit+0x33e/0x350 [ 132.379252] vprintk_default+0x44/0x58 [ 132.379252] printk+0x4e/0x58 [ 132.379253] __offline_isolated_pages+0x1a0/0x340 [ 132.379253] offline_isolated_pages_cb+0x2e/0x40 [ 132.379254] walk_system_ram_range+0x96/0xe8 [ 132.379254] __offline_pages.constprop.43+0x6da/0x918 [ 132.379255] memory_subsys_offline+0x6a/0xa0 [ 132.379255] device_offline+0x84/0xd8 [ 132.379255] store_mem_state+0xfe/0x120 [ 132.379256] kernfs_fop_write+0x126/0x1d0 [ 132.379256] __vfs_write+0x46/0x140 [ 132.379257] vfs_write+0xb8/0x1a8 [ 132.379257] SyS_write+0x66/0xc0 [ 132.379257] system_call+0xc4/0x240 [ 132.379258] [ 132.379258] other info that might help us debug this: [ 132.379258] [ 132.379259] Chain exists of: [ 132.379259] &(&sclp_con_lock)->rlock --> &(&base->lock)->rlock --> &(&zone->lock)->rlock [ 132.379262] [ 132.379262] Possible unsafe locking scenario: [ 132.379262] [ 132.379263] CPU0 CPU1 [ 132.379263] ---- ---- [ 132.379264] lock(&(&zone->lock)->rlock); [ 132.379265] lock(&(&base->lock)->rlock); [ 132.379266] lock(&(&zone->lock)->rlock); [ 132.379267] lock(&(&sclp_con_lock)->rlock); [ 132.379268] [ 132.379268] *** DEADLOCK *** [ 132.379269] [ 132.379269] 10 locks held by sh/708: [ 132.379269] #0: (sb_writers#4){.+.+.+}, at: [<0000000000345d4a>] vfs_write+0xa2/0x1a8 [ 132.379272] #1: (&of->mutex){+.+.+.}, at: [<00000000003e5e60>] kernfs_fop_write+0xd8/0x1d0 [ 132.379273] #2: (s_active#180){.+.+.+}, at: [<00000000003e5e6c>] kernfs_fop_write+0xe4/0x1d0 [ 132.379275] #3: (device_hotplug_lock){+.+...}, at: [<000000000069bb58>] lock_device_hotplug_sysfs+0x30/0x68 [ 132.379277] #4: (memory_add_remove_lock){+.+...}, at: [<000000000031cfe6>] mem_hotplug_begin+0x3e/0xd0 [ 132.379279] #5: (mem_hotplug.lock){++++.+}, at: [<000000000031cfa8>] mem_hotplug_begin+0x0/0xd0 [ 132.379281] #6: (mem_hotplug.lock#2){+.+...}, at: [<000000000031d02c>] mem_hotplug_begin+0x84/0xd0 [ 132.379283] #7: (&dev->mutex){......}, at: [<000000000069d55c>] device_offline+0x5c/0xd8 [ 132.379285] #8: (&(&zone->lock)->rlock){..-...}, at: [<00000000002a65a8>] __offline_isolated_pages+0xe0/0x340 [ 132.379287] #9: (console_lock){+.+...}, at: [<00000000001bdc02>] vprintk_emit+0x332/0x350 [ 132.379289] [ 132.379289] stack backtrace: [ 132.379290] CPU: 1 PID: 708 Comm: sh Not tainted 4.11.0-rc3 #330 [ 132.379290] Hardware name: IBM 2827 H66 703 (LPAR) [ 132.379290] Stack: [ 132.379291] 000000006f5ab458 000000006f5ab4e8 0000000000000003 0000000000000000 [ 132.379293] 000000006f5ab588 000000006f5ab500 000000006f5ab500 0000000000000020 [ 132.379294] 0000000000929180 000000000000000a 0000000000000020 000000000000000a [ 132.379296] 000000000000000c 000000006f5ab550 0000000000000000 0000000000000000 [ 132.379298] 04000000016e0b18 0000000000111da2 000000006f5ab4e8 000000006f5ab540 [ 132.379300] Call Trace: [ 132.379300] ([<0000000000111c86>] show_trace+0x8e/0xe0) [ 132.379301] [<0000000000111d58>] show_stack+0x80/0xd8 [ 132.379301] [<00000000005ff656>] dump_stack+0x96/0xd8 [ 132.379302] [<0000000000291572>] print_circular_bug+0x2f2/0x308 [ 132.379302] [<00000000001a6b98>] check_prev_add+0x150/0x648 [ 132.379303] [<00000000001a7bc6>] validate_chain.isra.22+0xb36/0xd18 [ 132.379303] [<00000000001a9240>] __lock_acquire+0x650/0x880 [ 132.379304] [<00000000001a9bee>] lock_acquire+0x24e/0x2b0 [ 132.379304] [<00000000008dcab0>] _raw_spin_lock_irqsave+0x70/0xb8 [ 132.379304] [<000000000074524a>] sclp_console_write+0x4a/0x2b8 [ 132.379305] [<00000000001bd71e>] console_unlock+0x4a6/0x658 [ 132.379305] [<00000000001bdc0e>] vprintk_emit+0x33e/0x350 [ 132.379306] [<00000000001bddec>] vprintk_default+0x44/0x58 [ 132.379306] [<0000000000291c8e>] printk+0x4e/0x58 [ 132.379307] [<00000000002a6668>] __offline_isolated_pages+0x1a0/0x340 [ 132.379307] [<000000000031cbbe>] offline_isolated_pages_cb+0x2e/0x40 [ 132.379308] [<0000000000147466>] walk_system_ram_range+0x96/0xe8 [ 132.379308] [<00000000008d2bba>] __offline_pages.constprop.43+0x6da/0x918 [ 132.379309] [<00000000006b9b2a>] memory_subsys_offline+0x6a/0xa0 [ 132.379309] [<000000000069d584>] device_offline+0x84/0xd8 [ 132.379310] [<00000000006ba146>] store_mem_state+0xfe/0x120 [ 132.379310] [<00000000003e5eae>] kernfs_fop_write+0x126/0x1d0 [ 132.379311] [<0000000000344b0e>] __vfs_write+0x46/0x140 [ 132.379311] [<0000000000345d60>] vfs_write+0xb8/0x1a8 [ 132.379312] [<000000000034742e>] SyS_write+0x66/0xc0 [ 132.379312] [<00000000008dda2c>] system_call+0xc4/0x240 [ 132.379313] INFO: lockdep is turned off. [ 377.096545] ====================================================== [ 377.096545] [ INFO: possible circular locking dependency detected ] [ 377.096546] 4.11.0-rc2-00238-ga722844 #120 Not tainted [ 377.096546] ------------------------------------------------------- [ 377.096547] sh/848 is trying to acquire lock: [ 377.096547] ((console_sem).lock){-.....}, at: [<00000000001a594e>] down_trylock+0x2e/0x68 [ 377.096549] but task is already holding lock: [ 377.096550] (&(&zone->lock)->rlock){..-...}, at: [<00000000002ae3da>] __offline_isolated_pages+0x302/0x370 [ 377.096552] which lock already depends on the new lock. [ 377.096553] the existing dependency chain (in reverse order) is: [ 377.096554] -> #3 (&(&zone->lock)->rlock){..-...}: [ 377.096556] validate_chain.isra.10+0xb7a/0xda8 [ 377.096557] __lock_acquire+0x62c/0x850 [ 377.096557] lock_acquire+0x254/0x2b8 [ 377.096558] _raw_spin_lock_irqsave+0x70/0xb8 [ 377.096558] get_page_from_freelist+0x4e2/0xf28 [ 377.096558] __alloc_pages_nodemask+0x1fa/0x1528 [ 377.096559] allocate_slab+0xf2/0x6c0 [ 377.096559] new_slab+0x94/0xa8 [ 377.096560] ___slab_alloc.constprop.22+0x570/0x590 [ 377.096560] __slab_alloc.isra.16.constprop.21+0x74/0xa8 [ 377.096561] kmem_cache_alloc+0x13c/0x4b0 [ 377.096561] __debug_object_init+0x5c/0x468 [ 377.096561] hrtimer_init+0x42/0x1d8 [ 377.096562] init_dl_task_timer+0x3a/0x58 [ 377.096562] __sched_fork.isra.5+0x7e/0xe8 [ 377.096563] init_idle+0x7a/0x278 [ 377.096563] fork_idle+0xa4/0xb8 [ 377.096564] idle_threads_init+0x64/0xc8 [ 377.096564] smp_init+0x2e/0x110 [ 377.096565] kernel_init_freeable+0x16c/0x2e0 [ 377.096565] kernel_init+0x2a/0x148 [ 377.096565] kernel_thread_starter+0x6/0xc [ 377.096566] kernel_thread_starter+0x0/0xc [ 377.096567] -> #2 (&rq->lock){-.-.-.}: [ 377.096568] validate_chain.isra.10+0xb7a/0xda8 [ 377.096569] __lock_acquire+0x62c/0x850 [ 377.096569] lock_acquire+0x254/0x2b8 [ 377.096569] _raw_spin_lock+0x60/0xa0 [ 377.096570] task_fork_fair+0x54/0x118 [ 377.096570] sched_fork+0x13e/0x2a0 [ 377.096571] copy_process+0x642/0x1e80 [ 377.096571] _do_fork+0xc2/0x6d0 [ 377.096571] kernel_thread+0x4e/0x60 [ 377.096572] rest_init+0x42/0x170 [ 377.096572] start_kernel+0x464/0x478 [ 377.096572] _stext+0x20/0x80 [ 377.096573] -> #1 (&p->pi_lock){-.-.-.}: [ 377.096574] validate_chain.isra.10+0xb7a/0xda8 [ 377.096575] __lock_acquire+0x62c/0x850 [ 377.096575] lock_acquire+0x254/0x2b8 [ 377.096576] _raw_spin_lock_irqsave+0x70/0xb8 [ 377.096576] try_to_wake_up+0x4a/0x5f8 [ 377.096576] up+0x58/0x70 [ 377.096577] __up_console_sem+0x5a/0x90 [ 377.096577] console_unlock+0x610/0x670 [ 377.096578] vprintk_emit+0x308/0x378 [ 377.096578] dev_vprintk_emit+0x1e6/0x200 [ 377.096578] dev_printk_emit+0x4e/0x60 [ 377.096579] __dev_printk+0x8e/0xb0 [ 377.096579] dev_warn+0x68/0x78 [ 377.096579] qeth_dev_portname_store+0x58/0x70 [ 377.096580] kernfs_fop_write+0x124/0x1c8 [ 377.096580] __vfs_write+0x36/0x140 [ 377.096580] vfs_write+0xb8/0x1a0 [ 377.096581] SyS_write+0x66/0xc0 [ 377.096581] system_call+0xc4/0x240 [ 377.096582] -> #0 ((console_sem).lock){-.....}: [ 377.096583] check_prev_add+0x160/0x6e8 [ 377.096584] validate_chain.isra.10+0xb7a/0xda8 [ 377.096584] __lock_acquire+0x62c/0x850 [ 377.096584] lock_acquire+0x254/0x2b8 [ 377.096585] _raw_spin_lock_irqsave+0x70/0xb8 [ 377.096585] down_trylock+0x2e/0x68 [ 377.096586] __down_trylock_console_sem+0x4a/0xc0 [ 377.096586] console_trylock+0x32/0xc8 [ 377.096586] vprintk_emit+0x2fc/0x378 [ 377.096587] vprintk_default+0x44/0x58 [ 377.096587] printk+0x4e/0x60 [ 377.096587] __offline_isolated_pages+0x168/0x370 [ 377.096588] offline_isolated_pages_cb+0x2e/0x40 [ 377.096588] walk_system_ram_range+0x92/0xf0 [ 377.096589] __offline_pages.constprop.9+0x74a/0x988 [ 377.096589] memory_subsys_offline+0x6c/0xa0 [ 377.096589] device_offline+0x84/0xe0 [ 377.096590] store_mem_state+0xfe/0x120 [ 377.096590] kernfs_fop_write+0x124/0x1c8 [ 377.096590] __vfs_write+0x36/0x140 [ 377.096591] vfs_write+0xb8/0x1a0 [ 377.096591] SyS_write+0x66/0xc0 [ 377.096592] system_call+0xc4/0x240 [ 377.096592] other info that might help us debug this: [ 377.096593] Chain exists of: [ 377.096593] (console_sem).lock --> &rq->lock --> &(&zone->lock)->rlock [ 377.096595] Possible unsafe locking scenario: [ 377.096596] CPU0 CPU1 [ 377.096596] ---- ---- [ 377.096597] lock(&(&zone->lock)->rlock); [ 377.096598] lock(&rq->lock); [ 377.096599] lock(&(&zone->lock)->rlock); [ 377.096600] lock((console_sem).lock); [ 377.096601] *** DEADLOCK *** [ 377.096601] 9 locks held by sh/848: [ 377.096602] #0: (sb_writers#5){.+.+.+}, at: [<000000000034f9c2>] vfs_write+0xa2/0x1a0 [ 377.096604] #1: (&of->mutex){+.+.+.}, at: [<00000000003f1afe>] kernfs_fop_write+0xd6/0x1c8 [ 377.096605] #2: (s_active#51){.+.+.+}, at: [<00000000003f1b0a>] kernfs_fop_write+0xe2/0x1c8 [ 377.096607] #3: (device_hotplug_lock){+.+.+.}, at: [<00000000006adb70>] lock_device_hotplug_sysfs+0x30/0x70 [ 377.096609] #4: (memory_add_remove_lock){+.+.+.}, at: [<0000000000328e16>] mem_hotplug_begin+0x3e/0xd0 [ 377.096610] #5: (mem_hotplug.lock){++++++}, at: [<0000000000328dd8>] mem_hotplug_begin+0x0/0xd0 [ 377.096612] #6: (mem_hotplug.lock#2){+.+.+.}, at: [<0000000000328e5c>] mem_hotplug_begin+0x84/0xd0 [ 377.096614] #7: (&dev->mutex){......}, at: [<00000000006af41c>] device_offline+0x5c/0xe0 [ 377.096615] #8: (&(&zone->lock)->rlock){..-...}, at: [<00000000002ae3da>] __offline_isolated_pages+0x302/0x370 [ 377.096617] stack backtrace: [ 377.096618] CPU: 0 PID: 848 Comm: sh Not tainted 4.11.0-rc2-00238-ga722844 #120 [ 377.096618] Hardware name: IBM 2827 H66 703 (z/VM) [ 377.096619] Stack: [ 377.096619] 000000003ed0f450 000000003ed0f4e0 0000000000000003 0000000000000000 [ 377.096620] 000000003ed0f580 000000003ed0f4f8 000000003ed0f4f8 0000000000000020 [ 377.096629] 00000000009441c0 0000000000000020 000000000000000a 000000000000000a [ 377.096631] 000000000000000c 000000003ed0f548 0000000000000000 0000000000000000 [ 377.096633] 04000000016fa078 0000000000112338 000000003ed0f4e0 000000003ed0f538 [ 377.096635] Call Trace: [ 377.096635] ([<0000000000112212>] show_trace+0x8a/0xe0) [ 377.096636] [<00000000001122e8>] show_stack+0x80/0xd8 [ 377.096636] [<000000000060f5a6>] dump_stack+0x96/0xd8 [ 377.096637] [<00000000001a9d3c>] print_circular_bug+0x314/0x340 [ 377.096637] [<00000000001aacc0>] check_prev_add+0x160/0x6e8 [ 377.096638] [<00000000001abdc2>] validate_chain.isra.10+0xb7a/0xda8 [ 377.096638] [<00000000001ad7dc>] __lock_acquire+0x62c/0x850 [ 377.096639] [<00000000001ae20c>] lock_acquire+0x254/0x2b8 [ 377.096639] [<00000000008f63b8>] _raw_spin_lock_irqsave+0x70/0xb8 [ 377.096639] [<00000000001a594e>] down_trylock+0x2e/0x68 [ 377.096640] [<00000000001c1262>] __down_trylock_console_sem+0x4a/0xc0 [ 377.096641] [<00000000001c130a>] console_trylock+0x32/0xc8 [ 377.096641] [<00000000001c2c84>] vprintk_emit+0x2fc/0x378 [ 377.096641] [<00000000001c2edc>] vprintk_default+0x44/0x58 [ 377.096642] [<000000000029a24e>] printk+0x4e/0x60 [ 377.096642] [<00000000002ae240>] __offline_isolated_pages+0x168/0x370 [ 377.096643] [<00000000003289d6>] offline_isolated_pages_cb+0x2e/0x40 [ 377.096643] [<00000000001492fa>] walk_system_ram_range+0x92/0xf0 [ 377.096644] [<00000000008ec1fa>] __offline_pages.constprop.9+0x74a/0x988 [ 377.096644] [<00000000006cc9bc>] memory_subsys_offline+0x6c/0xa0 [ 377.096645] [<00000000006af444>] device_offline+0x84/0xe0 [ 377.096645] [<00000000006cd0ae>] store_mem_state+0xfe/0x120 [ 377.096645] [<00000000003f1b4c>] kernfs_fop_write+0x124/0x1c8 [ 377.096646] [<000000000034e736>] __vfs_write+0x36/0x140 [ 377.096646] [<000000000034f9d8>] vfs_write+0xb8/0x1a0 [ 377.096647] [<000000000035110e>] SyS_write+0x66/0xc0 [ 377.096647] [<00000000008f74ac>] system_call+0xc4/0x240 [ 377.096648] INFO: lockdep is turned off.