Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750969AbaDKEK5 (ORCPT ); Fri, 11 Apr 2014 00:10:57 -0400 Received: from e23smtp06.au.ibm.com ([202.81.31.148]:60741 "EHLO e23smtp06.au.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750710AbaDKEKz (ORCPT ); Fri, 11 Apr 2014 00:10:55 -0400 Message-ID: <1397189445.3649.14.camel@ThinkPad-T5421> Subject: [RFC PATCH v2] Use kernfs_break_active_protection() for device online store callbacks From: Li Zhong To: Tejun Heo Cc: LKML , gregkh@linuxfoundation.org, rafael.j.wysocki@intel.com, toshi.kani@hp.com Date: Fri, 11 Apr 2014 12:10:45 +0800 In-Reply-To: <20140410133116.GB25308@htj.dyndns.org> References: <1397121514.25199.91.camel@ThinkPad-T5421.cn.ibm.com> <20140410133116.GB25308@htj.dyndns.org> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.2.3-0ubuntu6 Content-Transfer-Encoding: 7bit Mime-Version: 1.0 X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14041104-7014-0000-0000-000004ADE253 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org I noticed following lockdep warning when trying acpi hot-remove cpus: [84154.204080] ====================================================== [84154.204080] [ INFO: possible circular locking dependency detected ] [84154.204080] 3.14.0-next-20140408+ #24 Tainted: G W [84154.204080] ------------------------------------------------------- [84154.204080] bash/777 is trying to acquire lock: [84154.204080] (cpu_add_remove_lock){+.+.+.}, at: [] cpu_maps_update_begin+0x17/0x20 [84154.213203] [84154.213203] but task is already holding lock: [84154.213203] (s_active#79){++++.+}, at: [] kernfs_fop_write+0xe4/0x190 [84154.213203] [84154.213203] which lock already depends on the new lock. [84154.213203] [84154.213203] [84154.213203] the existing dependency chain (in reverse order) is: [84154.213203] -> #3 (s_active#79){++++.+}: [84154.213203] [] lock_acquire+0x9b/0x1d0 [84154.213203] [] __kernfs_remove+0x250/0x310 [84154.213203] [] kernfs_remove_by_name_ns +0x50/0xc0 [84154.213203] [] sysfs_remove_file_ns +0x15/0x20 [84154.213203] [] device_remove_file+0x19/0x20 [84154.213203] [] device_remove_attrs+0x33/0x80 [84154.213203] [] device_del+0x127/0x1c0 [84154.213203] [] device_unregister+0x22/0x60 [84154.213203] [] unregister_cpu+0x1e/0x40 [84154.213203] [] arch_unregister_cpu+0x23/0x30 [84154.213203] [] acpi_processor_remove +0x8d/0xb2 [84154.213203] [] acpi_bus_trim+0x5a/0x8d [84154.213203] [] acpi_device_hotplug +0x1a8/0x3ec [84154.213203] [] acpi_hotplug_work_fn +0x1f/0x2b [84154.213203] [] process_one_work+0x1eb/0x6b0 [84154.213203] [] worker_thread+0x11b/0x370 [84154.213203] [] kthread+0xe4/0x100 [84154.213203] [] ret_from_fork+0x7c/0xb0 [84154.213203] -> #2 (cpu_hotplug.lock#2){+.+.+.}: [84154.213203] [] lock_acquire+0x9b/0x1d0 [84154.213203] [] mutex_lock_nested+0x50/0x3c0 [84154.213203] [] cpu_hotplug_begin+0x4f/0x80 [84154.213203] [] _cpu_up+0x3f/0x160 [84154.213203] [] cpu_up+0x69/0x80 [84154.213203] [] smp_init+0x60/0x8c [84154.213203] [] kernel_init_freeable +0x126/0x23b [84154.213203] [] kernel_init+0xe/0xf0 [84154.213203] [] ret_from_fork+0x7c/0xb0 [84154.213203] -> #1 (cpu_hotplug.lock){++++++}: [84154.213203] [] lock_acquire+0x9b/0x1d0 [84154.213203] [] cpu_hotplug_begin+0x41/0x80 [84154.213203] [] _cpu_up+0x3f/0x160 [84154.213203] [] cpu_up+0x69/0x80 [84154.213203] [] smp_init+0x60/0x8c [84154.213203] [] kernel_init_freeable +0x126/0x23b [84154.213203] [] kernel_init+0xe/0xf0 [84154.213203] [] ret_from_fork+0x7c/0xb0 [84154.213203] -> #0 (cpu_add_remove_lock){+.+.+.}: [84154.213203] [] __lock_acquire+0x1f2a/0x1f60 [84154.213203] [] lock_acquire+0x9b/0x1d0 [84154.213203] [] mutex_lock_nested+0x50/0x3c0 [84154.213203] [] cpu_maps_update_begin +0x17/0x20 [84154.213203] [] cpu_down+0x1d/0x50 [84154.213203] [] cpu_subsys_offline+0x14/0x20 [84154.213203] [] device_offline+0xad/0xd0 [84154.213203] [] online_store+0x42/0x80 [84154.213203] [] dev_attr_store+0x18/0x30 [84154.213203] [] sysfs_kf_write+0x49/0x60 [84154.213203] [] kernfs_fop_write+0x109/0x190 [84154.213203] [] vfs_write+0xbe/0x1c0 [84154.213203] [] SyS_write+0x52/0xb0 [84154.213203] [] tracesys+0xd0/0xd5 [84154.213203] [84154.213203] other info that might help us debug this: [84154.213203] [84154.213203] Chain exists of: cpu_add_remove_lock --> cpu_hotplug.lock#2 --> s_active#79 [84154.213203] Possible unsafe locking scenario: [84154.213203] [84154.213203] CPU0 CPU1 [84154.213203] ---- ---- [84154.213203] lock(s_active#79); [84154.213203] lock(cpu_hotplug.lock#2); [84154.213203] lock(s_active#79); [84154.213203] lock(cpu_add_remove_lock); [84154.213203] [84154.213203] *** DEADLOCK *** ............. The deadlock itself seems already fixed in commit 5e33bc41. As Tejun suggested, to avoid this lockdep warning, kernfs_break_active_protection() is used before online/offline callbacks to take the s_active lock out of the dependency chain during online/offline operations. Signed-off-by: Li Zhong --- drivers/base/core.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/base/core.c b/drivers/base/core.c index 0dd6528..2b9f68e 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -439,6 +439,7 @@ static ssize_t online_store(struct device *dev, struct device_attribute *attr, { bool val; int ret; + struct kernfs_node *kn; ret = strtobool(buf, &val); if (ret < 0) @@ -448,7 +449,15 @@ static ssize_t online_store(struct device *dev, struct device_attribute *attr, if (ret) return ret; + kn = kernfs_find_and_get(dev->kobj.sd, attr->attr.name); + if (WARN_ON_ONCE(!kn)) + goto out; + + kernfs_break_active_protection(kn); + ret = val ? device_online(dev) : device_offline(dev); + kernfs_unbreak_active_protection(kn); +out: unlock_device_hotplug(); return ret < 0 ? ret : count; } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/