Date: Sun, 5 Apr 2009 15:44:54 +0200
From: Ingo Molnar
To: "Rafael J. Wysocki", Peter Zijlstra, Rusty Russell
Cc: Ming Lei, Andrew Morton, Linux-pm mailing list, Linux Kernel List,
	Gautham R Shenoy
Subject: Re: pm-hibernate : possible circular locking dependency detected
Message-ID: <20090405134454.GB25250@elte.hu>
In-Reply-To: <200904051212.53353.rjw@sisk.pl>

* Rafael J. Wysocki wrote:

> On Sunday 05 April 2009, Ming Lei wrote:
> > kernel version : one simple usb-serial patch against commit
> > 6bb597507f9839b13498781e481f5458aea33620.
> >
> > Thanks.
>
> Hmm, CPU hotplug again, it seems.
>
> I'm not sure who's the maintainer at the moment. Andrew, is that
> Gautham?

CPU hotplug tends to land on the scheduler people's desk normally.
But i'm not sure that's the real thing here - key appears to be this
work_on_cpu() worklet by the cpufreq code:

> > [ 2276.033134] -> #1 (work_on_cpu){+.+.+.}:
> > [ 2276.033138]  [] __lock_acquire+0x1402/0x178c
> > [ 2276.033141]  [] lock_acquire+0x93/0xbf
> > [ 2276.033145]  [] flush_work+0x60/0xeb
> > [ 2276.033148]  [] work_on_cpu+0x73/0x84
> > [ 2276.033151]  [] get_cur_val+0xb6/0xe3 [acpi_cpufreq]
> > [ 2276.033157]  [] get_cur_freq_on_cpu+0x74/0xc5 [acpi_cpufreq]
> > [ 2276.033161]  [] acpi_cpufreq_cpu_init+0x381/0x4c4 [acpi_cpufreq]
> > [ 2276.033166]  [] cpufreq_add_dev+0x1bb/0x676

Connecting things the wrong way around. Have not looked deeply though -
Peter, Rusty, what's your take?

(below is the message again)

	Ingo

> > [ 2276.031486] PM: Saving platform NVS memory
> > [ 2276.032410] Disabling non-boot CPUs ...
> > [ 2276.032879]
> > [ 2276.032880] =======================================================
> > [ 2276.032882] [ INFO: possible circular locking dependency detected ]
> > [ 2276.032885] 2.6.29-08003-g8b58977 #84
> > [ 2276.032886] -------------------------------------------------------
> > [ 2276.032888] pm-hibernate/5061 is trying to acquire lock:
> > [ 2276.032891]  (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}, at: [] lock_policy_rwsem_write+0x48/0x79
> > [ 2276.032900]
> > [ 2276.032900] but task is already holding lock:
> > [ 2276.032902]  (&cpu_hotplug.lock){+.+.+.}, at: [] cpu_hotplug_begin+0x22/0x50
> > [ 2276.032908]
> > [ 2276.032909] which lock already depends on the new lock.
> > [ 2276.032910]
> > [ 2276.032911]
> > [ 2276.032912] the existing dependency chain (in reverse order) is:
> > [ 2276.032913]
> > [ 2276.032914] -> #6 (&cpu_hotplug.lock){+.+.+.}:
> > [ 2276.032918]  [] __lock_acquire+0x1402/0x178c
> > [ 2276.032922]  [] lock_acquire+0x93/0xbf
> > [ 2276.032926]  [] mutex_lock_nested+0x6a/0x362
> > [ 2276.032930]  [] cpu_hotplug_begin+0x22/0x50
> > [ 2276.032933]  [] _cpu_up+0x58/0x12f
> > [ 2276.032937]  [] cpu_up+0x63/0x78
> > [ 2276.032940]  [] kernel_init+0xbd/0x189
> > [ 2276.032944]  [] child_rip+0xa/0x20
> > [ 2276.032948]  [] 0xffffffffffffffff
> > [ 2276.032960]
> > [ 2276.032961] -> #5 (cpu_add_remove_lock){+.+.+.}:
> > [ 2276.032964]  [] __lock_acquire+0x1402/0x178c
> > [ 2276.032968]  [] lock_acquire+0x93/0xbf
> > [ 2276.032971]  [] mutex_lock_nested+0x6a/0x362
> > [ 2276.032974]  [] cpu_maps_update_begin+0x12/0x14
> > [ 2276.032978]  [] __create_workqueue_key+0xdc/0x1a2
> > [ 2276.032982]  [] stop_machine_create+0x3a/0x9b
> > [ 2276.032987]  [] load_module+0x58/0x18c8
> > [ 2276.032990]  [] sys_init_module+0x57/0x1cd
> > [ 2276.032993]  [] system_call_fastpath+0x16/0x1b
> > [ 2276.032998]  [] 0xffffffffffffffff
> > [ 2276.033001]
> > [ 2276.033001] -> #4 (setup_lock){+.+.+.}:
> > [ 2276.033005]  [] __lock_acquire+0x1402/0x178c
> > [ 2276.033008]  [] lock_acquire+0x93/0xbf
> > [ 2276.033012]  [] mutex_lock_nested+0x6a/0x362
> > [ 2276.033015]  [] stop_machine_create+0x12/0x9b
> > [ 2276.033018]  [] disable_nonboot_cpus+0x16/0x120
> > [ 2276.033022]  [] hibernation_snapshot+0x99/0x1ac
> > [ 2276.033025]  [] hibernate+0xab/0x1cf
> > [ 2276.033028]  [] state_store+0x59/0xd7
> > [ 2276.033032]  [] kobj_attr_store+0x17/0x19
> > [ 2276.033036]  [] sysfs_write_file+0xe4/0x119
> > [ 2276.033040]  [] vfs_write+0xae/0x137
> > [ 2276.033044]  [] sys_write+0x47/0x6e
> > [ 2276.033047]  [] system_call_fastpath+0x16/0x1b
> > [ 2276.033050]  [] 0xffffffffffffffff
> > [ 2276.033053]
> > [ 2276.033054] -> #3 (dpm_list_mtx){+.+.+.}:
> > [ 2276.033057]  [] __lock_acquire+0x1402/0x178c
> > [ 2276.033061]  [] lock_acquire+0x93/0xbf
> > [ 2276.033065]  [] mutex_lock_nested+0x6a/0x362
> > [ 2276.033068]  [] device_pm_add+0x46/0xed
> > [ 2276.033073]  [] device_add+0x488/0x61a
> > [ 2276.033077]  [] device_register+0x19/0x1d
> > [ 2276.033080]  [] pcie_port_device_register+0x450/0x4b6
> > [ 2276.033085]  [] pcie_portdrv_probe+0x69/0x82
> > [ 2276.033089]  [] local_pci_probe+0x12/0x16
> > [ 2276.033093]  [] do_work_for_cpu+0x13/0x1b
> > [ 2276.033097]  [] worker_thread+0x1b2/0x2c9
> > [ 2276.033100]  [] kthread+0x49/0x76
> > [ 2276.033104]  [] child_rip+0xa/0x20
> > [ 2276.033107]  [] 0xffffffffffffffff
> > [ 2276.033111]
> > [ 2276.033111] -> #2 (&wfc.work){+.+.+.}:
> > [ 2276.033114]  [] __lock_acquire+0x1402/0x178c
> > [ 2276.033118]  [] lock_acquire+0x93/0xbf
> > [ 2276.033122]  [] worker_thread+0x1ac/0x2c9
> > [ 2276.033125]  [] kthread+0x49/0x76
> > [ 2276.033128]  [] child_rip+0xa/0x20
> > [ 2276.033131]  [] 0xffffffffffffffff
> > [ 2276.033134]
> > [ 2276.033134] -> #1 (work_on_cpu){+.+.+.}:
> > [ 2276.033138]  [] __lock_acquire+0x1402/0x178c
> > [ 2276.033141]  [] lock_acquire+0x93/0xbf
> > [ 2276.033145]  [] flush_work+0x60/0xeb
> > [ 2276.033148]  [] work_on_cpu+0x73/0x84
> > [ 2276.033151]  [] get_cur_val+0xb6/0xe3 [acpi_cpufreq]
> > [ 2276.033157]  [] get_cur_freq_on_cpu+0x74/0xc5 [acpi_cpufreq]
> > [ 2276.033161]  [] acpi_cpufreq_cpu_init+0x381/0x4c4 [acpi_cpufreq]
> > [ 2276.033166]  [] cpufreq_add_dev+0x1bb/0x676
> > [ 2276.033169]  [] sysdev_driver_register+0xc6/0x120
> > [ 2276.033173]  [] cpufreq_register_driver+0xbc/0x1ae
> > [ 2276.033177]  [] 0xffffffffa00c60c5
> > [ 2276.033180]  [] do_one_initcall+0x70/0x185
> > [ 2276.033183]  [] sys_init_module+0xab/0x1cd
> > [ 2276.033187]  [] system_call_fastpath+0x16/0x1b
> > [ 2276.033190]  [] 0xffffffffffffffff
> > [ 2276.033197]
> > [ 2276.033197] -> #0 (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}:
> > [ 2276.033201]  [] __lock_acquire+0x111f/0x178c
> > [ 2276.033204]  [] lock_acquire+0x93/0xbf
> > [ 2276.033208]  [] down_write+0x43/0x77
> > [ 2276.033211]  [] lock_policy_rwsem_write+0x48/0x79
> > [ 2276.033215]  [] cpufreq_cpu_callback+0x52/0x7a
> > [ 2276.033218]  [] notifier_call_chain+0x33/0x5b
> > [ 2276.033222]  [] __raw_notifier_call_chain+0x9/0xb
> > [ 2276.033226]  [] _cpu_down+0x86/0x298
> > [ 2276.033229]  [] disable_nonboot_cpus+0x78/0x120
> > [ 2276.033233]  [] hibernation_snapshot+0x99/0x1ac
> > [ 2276.033236]  [] hibernate+0xab/0x1cf
> > [ 2276.033239]  [] state_store+0x59/0xd7
> > [ 2276.033242]  [] kobj_attr_store+0x17/0x19
> > [ 2276.033246]  [] sysfs_write_file+0xe4/0x119
> > [ 2276.033249]  [] vfs_write+0xae/0x137
> > [ 2276.033252]  [] sys_write+0x47/0x6e
> > [ 2276.033255]  [] system_call_fastpath+0x16/0x1b
> > [ 2276.033259]  [] 0xffffffffffffffff
> > [ 2276.033262]
> > [ 2276.033262] other info that might help us debug this:
> > [ 2276.033263]
> > [ 2276.033265] 5 locks held by pm-hibernate/5061:
> > [ 2276.033267]  #0:  (&buffer->mutex){+.+.+.}, at: [] sysfs_write_file+0x38/0x119
> > [ 2276.033273]  #1:  (pm_mutex){+.+.+.}, at: [] hibernate+0x17/0x1cf
> > [ 2276.033278]  #2:  (dpm_list_mtx){+.+.+.}, at: [] device_pm_lock+0x12/0x14
> > [ 2276.033284]  #3:  (cpu_add_remove_lock){+.+.+.}, at: [] disable_nonboot_cpus+0x33/0x120
> > [ 2276.033289]  #4:  (&cpu_hotplug.lock){+.+.+.}, at: [] cpu_hotplug_begin+0x22/0x50
> > [ 2276.033294]
> > [ 2276.033295] stack backtrace:
> > [ 2276.033297] Pid: 5061, comm: pm-hibernate Not tainted 2.6.29-08003-g8b58977 #84
> > [ 2276.033300] Call Trace:
> > [ 2276.033304]  [] print_circular_bug_tail+0xc5/0xd0
> > [ 2276.033307]  [] __lock_acquire+0x111f/0x178c
> > [ 2276.033312]  [] ? sub_preempt_count+0xdc/0xf1
> > [ 2276.033315]  [] ? trace_hardirqs_on+0xd/0xf
> > [ 2276.033319]  [] lock_acquire+0x93/0xbf
> > [ 2276.033322]  [] ? lock_policy_rwsem_write+0x48/0x79
> > [ 2276.033326]  [] down_write+0x43/0x77
> > [ 2276.033329]  [] ? lock_policy_rwsem_write+0x48/0x79
> > [ 2276.033333]  [] lock_policy_rwsem_write+0x48/0x79
> > [ 2276.033337]  [] cpufreq_cpu_callback+0x52/0x7a
> > [ 2276.033340]  [] notifier_call_chain+0x33/0x5b
> > [ 2276.033344]  [] __raw_notifier_call_chain+0x9/0xb
> > [ 2276.033347]  [] _cpu_down+0x86/0x298
> > [ 2276.033350]  [] disable_nonboot_cpus+0x78/0x120
> > [ 2276.033350]  [] hibernation_snapshot+0x99/0x1ac
> > [ 2276.033350]  [] hibernate+0xab/0x1cf
> > [ 2276.033350]  [] state_store+0x59/0xd7
> > [ 2276.033350]  [] kobj_attr_store+0x17/0x19
> > [ 2276.033350]  [] sysfs_write_file+0xe4/0x119
> > [ 2276.033350]  [] vfs_write+0xae/0x137
> > [ 2276.033350]  [] sys_write+0x47/0x6e
> > [ 2276.033350]  [] system_call_fastpath+0x16/0x1b
> > [ 2276.034478] Broke affinity for irq 21
> > [ 2276.034478] Broke affinity for irq 22
> > [ 2276.034478] Broke affinity for irq 29
> > [ 2276.035812] CPU 1 is now offline
> > [ 2276.035814] lockdep: fixing up alternatives.
> > [ 2276.035816] SMP alternatives: switching to UP code
> > [ 2276.158329] CPU0 attaching NULL sched-domain.
> > [ 2276.158334] CPU1 attaching NULL sched-domain.
> > [ 2276.158361] CPU0 attaching NULL sched-domain.
> > [ 2276.158841] CPU1 is down
> > [ 2276.158867] Checking wake-up interrupts
> > [ 2276.158883] Suspending System Devices
> > [ 2276.158885] Suspending type 'oprofile':
> > [ 2276.158886]  oprofile0
> > [ 2276.158888] Suspending type 'edac':
> > [ 2276.158889] Suspending type 'irqrouter':
> > [ 2276.158891]  irqrouter0
> > [ 2276.158892] Suspending type 'clocksource':
> > [ 2276.158894]  clocksource0
> > [ 2276.158895] Suspending type 'timekeeping':
> > [ 2276.158897]  timekeeping0
> > [ 2276.158929] Extended CMOS year: 2000
> > [ 2276.158937] Suspending type 'ioapic':
> > [ 2276.158938]  ioapic0
> > [ 2276.159007] Suspending type 'lapic':
> > [ 2276.159007]  lapic0
> > [ 2276.159007] Suspending type 'machinecheck':
> > [ 2276.159007]  machinecheck0
> > [ 2276.159007] Suspending type 'i8237':
> > [ 2276.159007]  i82370
> > [ 2276.159007] Suspending type 'i8259':
> > [ 2276.159007]  i82590
> > [ 2276.159007] Suspending type 'node':
> > [ 2276.159007]  node0
> > [ 2276.159007] Suspending type 'cpu':
> > [ 2276.159007]  cpu0
> > [ 2276.159007]  cpu1
> > [ 2276.159007] PM: Creating hibernation image:
> > [ 2276.159007] PM: Need to copy 112986 pages
> > [ 2276.159007] PM: Restoring platform NVS memory
> > [ 2276.159007] Resuming System Devices
> > [ 2276.159007] Resuming type 'cpu':
> > [ 2276.159007]  cpu0
> > [ 2276.159007]  cpu1
> > [ 2276.159007] Resuming type 'node':
> > [ 2276.159007]  node0
> > [ 2276.159007] Resuming type 'i8259':
> > [ 2276.159007]  i82590
> > [ 2276.159007] Resuming type 'i8237':
> > [ 2276.159007]  i82370
> > [ 2276.159007] Resuming type 'machinecheck':
> > [ 2276.159007]  machinecheck0
> > [ 2276.159007] CPU0: Thermal LVT vector (0xfa) already installed
> > [ 2276.159007] Resuming type 'lapic':
> > [ 2276.159007]  lapic0
> > [ 2276.159007] Resuming type 'ioapic':
> > [ 2276.159007]  ioapic0
> > [ 2276.159007] Resuming type 'timekeeping':
> > [ 2276.159007]  timekeeping0
> > [ 2276.159007] Extended CMOS year: 2000
> > [ 2276.159007] Resuming type 'clocksource':
> > [ 2276.159007]  clocksource0
> > [ 2276.159007] Resuming type 'irqrouter':
> > [ 2276.159007]  irqrouter0
> > [ 2276.159007] Resuming type 'edac':
> > [ 2276.159007] Resuming type 'oprofile':
> > [ 2276.159007]  oprofile0
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/