From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Jens Axboe, "Peter Zijlstra (Intel)", Thomas Gleixner, Wanpeng Li, Jens Axboe, Thierry Escande
Subject: [PATCH 4.9 95/95] block/mq: fix potential deadlock during cpu hotplug
Date: Sun, 22 Apr 2018 15:54:04 +0200
Message-Id: <20180422135214.306911977@linuxfoundation.org>
In-Reply-To: <20180422135210.432103639@linuxfoundation.org>
References: <20180422135210.432103639@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Wanpeng Li

commit 51d638b1f56a0bfd9219800620994794a1a2b219 upstream.

This can be triggered by hot-unplugging one CPU.

======================================================
[ INFO: possible circular locking dependency detected ]
4.11.0+ #17 Not tainted
-------------------------------------------------------
step_after_susp/2640 is trying to acquire lock:
 (all_q_mutex){+.+...}, at: [] blk_mq_queue_reinit_work+0x18/0x110

but task is already holding lock:
 (cpu_hotplug.lock){+.+.+.}, at: [] cpu_hotplug_begin+0x7f/0xe0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (cpu_hotplug.lock){+.+.+.}:
       lock_acquire+0x11c/0x230
       __mutex_lock+0x92/0x990
       mutex_lock_nested+0x1b/0x20
       get_online_cpus+0x64/0x80
       blk_mq_init_allocated_queue+0x3a0/0x4e0
       blk_mq_init_queue+0x3a/0x60
       loop_add+0xe5/0x280
       loop_init+0x124/0x177
       do_one_initcall+0x53/0x1c0
       kernel_init_freeable+0x1e3/0x27f
       kernel_init+0xe/0x100
       ret_from_fork+0x31/0x40

-> #0 (all_q_mutex){+.+...}:
       __lock_acquire+0x189a/0x18a0
       lock_acquire+0x11c/0x230
       __mutex_lock+0x92/0x990
       mutex_lock_nested+0x1b/0x20
       blk_mq_queue_reinit_work+0x18/0x110
       blk_mq_queue_reinit_dead+0x1c/0x20
       cpuhp_invoke_callback+0x1f2/0x810
       cpuhp_down_callbacks+0x42/0x80
       _cpu_down+0xb2/0xe0
       freeze_secondary_cpus+0xb6/0x390
       suspend_devices_and_enter+0x3b3/0xa40
       pm_suspend+0x129/0x490
       state_store+0x82/0xf0
       kobj_attr_store+0xf/0x20
       sysfs_kf_write+0x45/0x60
       kernfs_fop_write+0x135/0x1c0
       __vfs_write+0x37/0x160
       vfs_write+0xcd/0x1d0
       SyS_write+0x58/0xc0
       do_syscall_64+0x8f/0x710
       return_from_SYSCALL_64+0x0/0x7a

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(cpu_hotplug.lock);
                               lock(all_q_mutex);
                               lock(cpu_hotplug.lock);
  lock(all_q_mutex);

 *** DEADLOCK ***

8 locks held by step_after_susp/2640:
 #0:  (sb_writers#6){.+.+.+}, at: [] vfs_write+0x1ad/0x1d0
 #1:  (&of->mutex){+.+.+.}, at: [] kernfs_fop_write+0x101/0x1c0
 #2:  (s_active#166){.+.+.+}, at: [] kernfs_fop_write+0x109/0x1c0
 #3:  (pm_mutex){+.+...}, at: [] pm_suspend+0x21d/0x490
 #4:  (acpi_scan_lock){+.+.+.}, at: [] acpi_scan_lock_acquire+0x17/0x20
 #5:  (cpu_add_remove_lock){+.+.+.}, at: [] freeze_secondary_cpus+0x27/0x390
 #6:  (cpu_hotplug.dep_map){++++++}, at: [] cpu_hotplug_begin+0x5/0xe0
 #7:  (cpu_hotplug.lock){+.+.+.}, at: [] cpu_hotplug_begin+0x7f/0xe0

stack backtrace:
CPU: 3 PID: 2640 Comm: step_after_susp Not tainted 4.11.0+ #17
Hardware name: Dell Inc. OptiPlex 7040/0JCTF8, BIOS 1.4.9 09/12/2016
Call Trace:
 dump_stack+0x99/0xce
 print_circular_bug+0x1fa/0x270
 __lock_acquire+0x189a/0x18a0
 lock_acquire+0x11c/0x230
 ? lock_acquire+0x11c/0x230
 ? blk_mq_queue_reinit_work+0x18/0x110
 ? blk_mq_queue_reinit_work+0x18/0x110
 __mutex_lock+0x92/0x990
 ? blk_mq_queue_reinit_work+0x18/0x110
 ? kmem_cache_free+0x2cb/0x330
 ? anon_transport_class_unregister+0x20/0x20
 ? blk_mq_queue_reinit_work+0x110/0x110
 mutex_lock_nested+0x1b/0x20
 ? mutex_lock_nested+0x1b/0x20
 blk_mq_queue_reinit_work+0x18/0x110
 blk_mq_queue_reinit_dead+0x1c/0x20
 cpuhp_invoke_callback+0x1f2/0x810
 ? __flow_cache_shrink+0x160/0x160
 cpuhp_down_callbacks+0x42/0x80
 _cpu_down+0xb2/0xe0
 freeze_secondary_cpus+0xb6/0x390
 suspend_devices_and_enter+0x3b3/0xa40
 ? rcu_read_lock_sched_held+0x79/0x80
 pm_suspend+0x129/0x490
 state_store+0x82/0xf0
 kobj_attr_store+0xf/0x20
 sysfs_kf_write+0x45/0x60
 kernfs_fop_write+0x135/0x1c0
 __vfs_write+0x37/0x160
 ? rcu_read_lock_sched_held+0x79/0x80
 ? rcu_sync_lockdep_assert+0x2f/0x60
 ? __sb_start_write+0xd9/0x1c0
 ? vfs_write+0x1ad/0x1d0
 vfs_write+0xcd/0x1d0
 SyS_write+0x58/0xc0
 ? rcu_read_lock_sched_held+0x79/0x80
 do_syscall_64+0x8f/0x710
 ? trace_hardirqs_on_thunk+0x1a/0x1c
 entry_SYSCALL64_slow_path+0x25/0x25

The CPU hotplug path holds cpu_hotplug.lock and then reinitializes all
existing blk-mq queues under all_q_mutex. However,
blk_mq_init_allocated_queue() takes these two locks in the opposite
order.

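The ordering rule behind the splat can be illustrated outside the kernel.
What follows is only a minimal user-space sketch using POSIX threads, not
the kernel code: hotplug_lock and all_q_lock are hypothetical stand-ins
for cpu_hotplug.lock and all_q_mutex, and the function names are invented
for illustration. Both paths take the two mutexes in the same order, so
the circular wait reported above cannot form.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for cpu_hotplug.lock and all_q_mutex. */
pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t all_q_lock = PTHREAD_MUTEX_INITIALIZER;

/* Counts completed critical sections; protected by all_q_lock. */
int sections_done;

/* Models the queue-init path: hotplug lock first, then queue list lock. */
void *queue_init_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&hotplug_lock);
	pthread_mutex_lock(&all_q_lock);
	sections_done++;
	pthread_mutex_unlock(&all_q_lock);
	pthread_mutex_unlock(&hotplug_lock);
	return NULL;
}

/* Models the hotplug path: the same order as queue_init_path(). */
void *hotplug_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&hotplug_lock);
	pthread_mutex_lock(&all_q_lock);
	sections_done++;
	pthread_mutex_unlock(&all_q_lock);
	pthread_mutex_unlock(&hotplug_lock);
	return NULL;
}

/*
 * Runs both paths concurrently and returns how many critical sections
 * completed, or -1 if a thread could not be created.
 */
int run_lock_order_demo(void)
{
	pthread_t a, b;

	sections_done = 0;
	if (pthread_create(&a, NULL, queue_init_path, NULL) != 0)
		return -1;
	if (pthread_create(&b, NULL, hotplug_path, NULL) != 0) {
		pthread_join(a, NULL);
		return -1;
	}
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return sections_done;
}
```

Swapping the two pthread_mutex_lock() calls in only one of the paths
would recreate the AB/BA pattern from the "Possible unsafe locking
scenario" table, where each thread can end up holding one mutex while
waiting for the other.
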
This is due to commit eabe06595d62 ("blk/mq: Cure cpu hotplug lock
inversion"), which fixes a CPU hotplug lock inversion introduced by the
hotplug rework. However, that rework is still a work in progress and
lives in a -tip branch, so mainline cannot yet trigger that splat. The
commit breaks Linus's tree in the merge window, so this patch reverts
the lock order to avoid the splat in Linus's tree.

Cc: Jens Axboe
Cc: Peter Zijlstra (Intel)
Cc: Thomas Gleixner
Signed-off-by: Wanpeng Li
Signed-off-by: Jens Axboe
Cc: Thierry Escande
Signed-off-by: Greg Kroah-Hartman

---
 block/blk-mq.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2019,15 +2019,15 @@ struct request_queue *blk_mq_init_alloca
 	blk_mq_init_cpu_queues(q, set->nr_hw_queues);
 
-	mutex_lock(&all_q_mutex);
 	get_online_cpus();
+	mutex_lock(&all_q_mutex);
 
 	list_add_tail(&q->all_q_node, &all_q_list);
 	blk_mq_add_queue_tag_set(set, q);
 	blk_mq_map_swqueue(q, cpu_online_mask);
 
-	put_online_cpus();
 	mutex_unlock(&all_q_mutex);
+	put_online_cpus();
 
 	return q;