Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751445AbaACJZg (ORCPT ); Fri, 3 Jan 2014 04:25:36 -0500 Received: from canardo.mork.no ([148.122.252.1]:60579 "EHLO canardo.mork.no" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751014AbaACJZb convert rfc822-to-8bit (ORCPT ); Fri, 3 Jan 2014 04:25:31 -0500 From: =?utf-8?Q?Bj=C3=B8rn_Mork?= To: "Rafael J. Wysocki" Cc: cpufreq@vger.kernel.org, Viresh Kumar , linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 1/2] cpufreq: try to resume policies which failed on last resume Organization: m References: <5562479.pVWRuDL0y6@vostro.rjw.lan> <87zjne7f75.fsf@nemi.mork.no> <2302938.b8tymqrMEz@vostro.rjw.lan> Date: Fri, 03 Jan 2014 10:24:45 +0100 In-Reply-To: <2302938.b8tymqrMEz@vostro.rjw.lan> (Rafael J. Wysocki's message of "Fri, 03 Jan 2014 01:40:42 +0100") Message-ID: <878uuxquxu.fsf@nemi.mork.no> User-Agent: Gnus/5.11002 (No Gnus v0.20) Emacs/23.4 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8BIT Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 11047 Lines: 249 "Rafael J. Wysocki" writes: > OK, thanks! Well, this is somewhat worrisome. > > Could you also check the linux-pm.git/fixes branch that contains all patches > I'm planning to push for 3.13-rc7 shortly? It's definitely still there. But I'm less sure about the exact trigger. I have now booted and suspend that branch a few times trying to figure out which actions it depends on. It seems to depend on a modified scaling_max_freq (or any other attribute, I guess - haven't testing anything else) in combination with suspend. But the order doesn't necessarily matter. I can sometimes set off the warning by writing to sysfs after resuming, and sometimes at suspend time if the value is already modified. The results are not consistent though. Sometimes there is no warning at all. But what's 100% sure is that I still can trigger this with the linux-pm.git/fixes branch. For example, triggered by echo 800000 >/sys/devices/system/cpu/cpu1/cpufreq/scaling_max_freq after resume (I believe the taint is due to a _request_firmware failure for one of the bluetooth devices): ====================================================== [ INFO: possible circular locking dependency detected ] 3.13.0-rc6+ #183 Tainted: G W ------------------------------------------------------- bash/4897 is trying to acquire lock: (cpu_hotplug.lock){+.+.+.}, at: [] get_online_cpus+0x3c/0x50 but task is already holding lock: (s_active#273){++++.+}, at: [] sysfs_write_file+0xd6/0x18b which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (s_active#273){++++.+}: [] lock_acquire+0xfb/0x144 [] sysfs_deactivate+0xa5/0x108 [] sysfs_addrm_finish+0x28/0x46 [] sysfs_remove+0x2a/0x31 [] sysfs_remove_dir+0x66/0x6b [] kobject_del+0x18/0x42 [] kobject_cleanup+0xe1/0x14f [] kobject_put+0x45/0x49 [] cpufreq_policy_put_kobj+0x37/0x83 [] __cpufreq_add_dev.isra.18+0x75e/0x78c [] cpufreq_cpu_callback+0x53/0x88 [] notifier_call_chain+0x67/0x92 [] __raw_notifier_call_chain+0x9/0xb [] __cpu_notify+0x1b/0x32 [] cpu_notify+0xe/0x10 [] _cpu_up+0xf1/0x124 [] enable_nonboot_cpus+0x52/0xbf [] hibernation_snapshot+0x1be/0x2ed [] snapshot_ioctl+0x1e5/0x431 [] do_vfs_ioctl+0x45e/0x4a8 [] SyS_ioctl+0x52/0x82 [] system_call_fastpath+0x16/0x1b -> #0 (cpu_hotplug.lock){+.+.+.}: [] __lock_acquire+0xae3/0xe68 [] lock_acquire+0xfb/0x144 [] mutex_lock_nested+0x6c/0x397 [] get_online_cpus+0x3c/0x50 [] store+0x20/0xad [] sysfs_write_file+0x138/0x18b [] vfs_write+0x9c/0x102 [] SyS_write+0x50/0x85 [] system_call_fastpath+0x16/0x1b other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(s_active#273); lock(cpu_hotplug.lock); lock(s_active#273); lock(cpu_hotplug.lock); *** DEADLOCK *** 3 locks held by bash/4897: #0: (sb_writers#5){.+.+.+}, at: [] file_start_write+0x27/0x29 #1: (&of->mutex){+.+.+.}, at: [] sysfs_write_file+0xce/0x18b #2: (s_active#273){++++.+}, at: [] sysfs_write_file+0xd6/0x18b stack backtrace: CPU: 1 PID: 4897 Comm: bash Tainted: G W 3.13.0-rc6+ #183 Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011 ffffffff81ca83b0 ffff88021fa65c08 ffffffff81399cac 0000000000000006 ffffffff81ca83b0 ffff88021fa65c58 ffffffff81397dc5 0000000000000002 ffff8800b1af2810 0000000000000003 ffff8800b1af2ef0 0000000000000003 Call Trace: [] dump_stack+0x4e/0x68 [] print_circular_bug+0x1f8/0x209 [] __lock_acquire+0xae3/0xe68 [] lock_acquire+0xfb/0x144 [] ? get_online_cpus+0x3c/0x50 [] mutex_lock_nested+0x6c/0x397 [] ? get_online_cpus+0x3c/0x50 [] ? __lock_is_held+0x32/0x54 [] ? get_online_cpus+0x3c/0x50 [] get_online_cpus+0x3c/0x50 [] store+0x20/0xad [] sysfs_write_file+0x138/0x18b [] vfs_write+0x9c/0x102 [] SyS_write+0x50/0x85 [] system_call_fastpath+0x16/0x1b Or by s2disk (not the same boot - this time without the taint): ====================================================== [ INFO: possible circular locking dependency detected ] 3.13.0-rc6+ #183 Not tainted ------------------------------------------------------- s2disk/5123 is trying to acquire lock: (s_active#171){++++.+}, at: [] sysfs_addrm_finish+0x28/0x46 but task is already holding lock: (cpu_hotplug.lock){+.+.+.}, at: [] cpu_hotplug_begin+0x28/0x50 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (cpu_hotplug.lock){+.+.+.}: [] lock_acquire+0xfb/0x144 [] mutex_lock_nested+0x6c/0x397 [] get_online_cpus+0x3c/0x50 [] store+0x20/0xad [] sysfs_write_file+0x138/0x18b [] vfs_write+0x9c/0x102 [] SyS_write+0x50/0x85 [] system_call_fastpath+0x16/0x1b -> #0 (s_active#171){++++.+}: [] __lock_acquire+0xae3/0xe68 [] lock_acquire+0xfb/0x144 [] sysfs_deactivate+0xa5/0x108 [] sysfs_addrm_finish+0x28/0x46 [] sysfs_remove+0x2a/0x31 [] sysfs_remove_dir+0x66/0x6b [] kobject_del+0x18/0x42 [] kobject_cleanup+0xe1/0x14f [] kobject_put+0x45/0x49 [] cpufreq_policy_put_kobj+0x37/0x83 [] __cpufreq_add_dev.isra.18+0x75e/0x78c [] cpufreq_cpu_callback+0x53/0x88 [] notifier_call_chain+0x67/0x92 [] __raw_notifier_call_chain+0x9/0xb [] __cpu_notify+0x1b/0x32 [] cpu_notify+0xe/0x10 [] _cpu_up+0xf1/0x124 [] enable_nonboot_cpus+0x52/0xbf [] hibernation_snapshot+0x1be/0x2ed [] snapshot_ioctl+0x1e5/0x431 [] do_vfs_ioctl+0x45e/0x4a8 [] SyS_ioctl+0x52/0x82 [] system_call_fastpath+0x16/0x1b other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(cpu_hotplug.lock); lock(s_active#171); lock(cpu_hotplug.lock); lock(s_active#171); *** DEADLOCK *** 7 locks held by s2disk/5123: #0: (pm_mutex){+.+.+.}, at: [] snapshot_ioctl+0x4b/0x431 #1: (device_hotplug_lock){+.+.+.}, at: [] lock_device_hotplug+0x12/0x14 #2: (acpi_scan_lock){+.+.+.}, at: [] acpi_scan_lock_acquire+0x12/0x14 #3: (console_lock){+.+.+.}, at: [] suspend_console+0x20/0x38 #4: (cpu_add_remove_lock){+.+.+.}, at: [] cpu_maps_update_begin+0x12/0x14 #5: (cpu_hotplug.lock){+.+.+.}, at: [] cpu_hotplug_begin+0x28/0x50 #6: (cpufreq_rwsem){.+.+.+}, at: [] __cpufreq_add_dev.isra.18+0x7f/0x78c stack backtrace: CPU: 0 PID: 5123 Comm: s2disk Not tainted 3.13.0-rc6+ #183 Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011 ffffffff81d3de70 ffff8800b10718f8 ffffffff81399cac 0000000000003a93 ffffffff81d3de70 ffff8800b1071948 ffffffff81397dc5 ffff8800b1071928 ffff8800b10c8bd0 0000000000000006 ffff8800b10c9358 0000000000000006 Call Trace: [] dump_stack+0x4e/0x68 [] print_circular_bug+0x1f8/0x209 [] __lock_acquire+0xae3/0xe68 [] ? debug_check_no_locks_freed+0x12c/0x143 [] lock_acquire+0xfb/0x144 [] ? sysfs_addrm_finish+0x28/0x46 [] ? lockdep_init_map+0x14e/0x160 [] sysfs_deactivate+0xa5/0x108 [] ? sysfs_addrm_finish+0x28/0x46 [] sysfs_addrm_finish+0x28/0x46 [] sysfs_remove+0x2a/0x31 [] sysfs_remove_dir+0x66/0x6b [] kobject_del+0x18/0x42 [] kobject_cleanup+0xe1/0x14f [] kobject_put+0x45/0x49 [] cpufreq_policy_put_kobj+0x37/0x83 [] __cpufreq_add_dev.isra.18+0x75e/0x78c [] ? __lock_is_held+0x32/0x54 [] cpufreq_cpu_callback+0x53/0x88 [] notifier_call_chain+0x67/0x92 [] __raw_notifier_call_chain+0x9/0xb [] __cpu_notify+0x1b/0x32 [] cpu_notify+0xe/0x10 [] _cpu_up+0xf1/0x124 [] enable_nonboot_cpus+0x52/0xbf [] hibernation_snapshot+0x1be/0x2ed [] snapshot_ioctl+0x1e5/0x431 [] do_vfs_ioctl+0x45e/0x4a8 [] ? fcheck_files+0xa1/0xe4 [] ? fget_light+0x33/0x9a [] SyS_ioctl+0x52/0x82 [] ? trace_hardirqs_on_thunk+0x3a/0x3f [] system_call_fastpath+0x16/0x1b CPU1 is up ACPI: Waking up from system sleep state S4 PM: noirq thaw of devices complete after 1.722 msecs PM: early thaw of devices complete after 1.180 msecs Bjørn -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/