From: "Dasgupta, Romit"
To: "Rafael J. Wysocki", "pavel@ucw.cz"
CC: "linux-omap@vger.kernel.org", "linux-pm@lists.linux-foundation.org", "linux-kernel@vger.kernel.org"
Date: Fri, 6 Nov 2009 15:05:01 +0530
Subject: [PATCH 0/1] PM: Thaws refrigerated and to be exited kernel threads

Hi,

The following patch fixes a hang that occurs when an active thread calls
kthread_stop() on a refrigerated (frozen) kernel thread. kthread_stop() blocks
until the exiting kernel thread has been cleaned up. If the exiting thread is
in the refrigerator, it never cleans up and the caller blocks forever.

I found the issue while trying out the following:

1) Suspend and resume the system with an mmc card inserted.
2) After resume, mount a filesystem on the mmc card.
3) Unmount the same filesystem.
4) An active bdi thread for the FS is now visible in the system.
5) Attempt to suspend the system. This results in a hang.

Here is the dump from khungtaskd:

# echo mem > /sys/power/state
PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.00 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
mmc1: card 0001 removed
mmc0: card 25b7 removed
INFO: task sh:388 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
sh D c027e86c 0 388 1 0x00000000
[] (schedule+0x2e0/0x36c) from [] (schedule_timeout+0x18/0x1ec)
[] (schedule_timeout+0x18/0x1ec) from [] (wait_for_common+0xe0/0x198)
[] (wait_for_common+0xe0/0x198) from [] (kthread_stop+0x44/0x78)
[] (kthread_stop+0x44/0x78) from [] (bdi_unregister+0x64/0xa4)
[] (bdi_unregister+0x64/0xa4) from [] (unlink_gendisk+0x20/0x3c)
[] (unlink_gendisk+0x20/0x3c) from [] (del_gendisk+0x84/0xb4)
[] (del_gendisk+0x84/0xb4) from [] (mmc_blk_remove+0x24/0x44)
[] (mmc_blk_remove+0x24/0x44) from [] (mmc_bus_remove+0x18/0x20)
[] (mmc_bus_remove+0x18/0x20) from [] (__device_release_driver+0x64/0xa4)
[] (__device_release_driver+0x64/0xa4) from [] (device_release_driver+0x1c/0x28)
[] (device_release_driver+0x1c/0x28) from [] (bus_remove_device+0x7c/0x90)
[] (bus_remove_device+0x7c/0x90) from [] (device_del+0x110/0x160)
[] (device_del+0x110/0x160) from [] (mmc_remove_card+0x50/0x64)
[] (mmc_remove_card+0x50/0x64) from [] (mmc_sd_remove+0x24/0x30)
[] (mmc_sd_remove+0x24/0x30) from [] (mmc_suspend_host+0x110/0x1a8)
[] (mmc_suspend_host+0x110/0x1a8) from [] (omap_hsmmc_suspend+0x74/0x104)
[] (omap_hsmmc_suspend+0x74/0x104) from [] (platform_pm_suspend+0x50/0x5c)
[] (platform_pm_suspend+0x50/0x5c) from [] (pm_op+0x30/0x74)
[] (pm_op+0x30/0x74) from [] (dpm_suspend_start+0x3b4/0x518)
[] (dpm_suspend_start+0x3b4/0x518) from [] (suspend_devices_and_enter+0x3c/0x1c4)
[] (suspend_devices_and_enter+0x3c/0x1c4) from [] (enter_state+0xe0/0x138)
[] (enter_state+0xe0/0x138) from [] (state_store+0x94/0xbc)
[] (state_store+0x94/0xbc) from [] (kobj_attr_store+0x18/0x1c)
[] (kobj_attr_store+0x18/0x1c) from [] (sysfs_write_file+0x108/0x13c)
[] (sysfs_write_file+0x108/0x13c) from [] (vfs_write+0xac/0x154)
[] (vfs_write+0xac/0x154)
from [] (sys_write+0x3c/0x68)
[] (sys_write+0x3c/0x68) from [] (ret_fast_syscall+0x0/0x2c)

Before the hang I ran ps and found the following bdi thread active:

474 root SW [flush-179:0]

After applying the patch (in the next email), the system suspended
successfully and the offending thread was cleaned up properly.

Thanks,
-Romit
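
The blocking described above (a caller waiting in kthread_stop() for a thread
that sits in the refrigerator) can be modelled in user space. The Python
sketch below is only an illustration under stated assumptions, not the patch
and not kernel code: ModelKthread, freeze(), thaw() and the thaw_first flag
are hypothetical names, and a timeout stands in for the indefinite wait so
that the model itself cannot hang.

```python
import threading


class ModelKthread:
    """User-space stand-in for a freezable kernel thread (illustrative only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self.frozen = False
        self.should_stop = False
        # Plays the role of the completion the stopping thread waits on.
        self.exited = threading.Event()
        # Daemon thread, so a permanently frozen worker cannot block exit.
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        with self._cond:
            while True:
                # "Refrigerator": while frozen, the thread makes no progress
                # and therefore never looks at should_stop.
                while self.frozen:
                    self._cond.wait()
                if self.should_stop:
                    break
                self._cond.wait()  # idle until something changes
        self.exited.set()

    def freeze(self):
        with self._cond:
            self.frozen = True
            self._cond.notify_all()

    def thaw(self):
        with self._cond:
            self.frozen = False
            self._cond.notify_all()

    def stop(self, thaw_first, timeout=0.5):
        """Ask the thread to exit; return True if it exited within timeout."""
        with self._cond:
            self.should_stop = True
            if thaw_first:
                self.frozen = False  # assumed fix: thaw before waiting
            self._cond.notify_all()
        return self.exited.wait(timeout)


def demo():
    stuck = ModelKthread()
    stuck.freeze()
    hung = stuck.stop(thaw_first=False)   # frozen thread never exits

    fixed = ModelKthread()
    fixed.freeze()
    clean = fixed.stop(thaw_first=True)   # thawed thread sees should_stop
    return hung, clean


if __name__ == "__main__":
    print(demo())
```

In this model, stop(thaw_first=False) times out because the frozen worker
never re-checks should_stop, which mirrors the sh task stuck in
wait_for_common() in the trace. With thaw_first=True the worker leaves the
refrigerator loop, notices should_stop, and signals its exit.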