Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1764550AbYFZNcv (ORCPT ); Thu, 26 Jun 2008 09:32:51 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1761270AbYFZNc3 (ORCPT ); Thu, 26 Jun 2008 09:32:29 -0400
Received: from e28smtp03.in.ibm.com ([59.145.155.3]:56783 "EHLO e28esmtp03.in.ibm.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1760055AbYFZNc0 (ORCPT ); Thu, 26 Jun 2008 09:32:26 -0400
Message-ID: <48639A40.8090203@linux.vnet.ibm.com>
Date: Thu, 26 Jun 2008 19:01:44 +0530
From: Nageswara R Sastry
User-Agent: Thunderbird 2.0.0.14 (X11/20080505)
MIME-Version: 1.0
To: Johannes Weiner
CC: linux-kernel@vger.kernel.org, balbir@linux.vnet.ibm.com, ego@linux.vnet.ibm.com, svaidy@linux.vnet.ibm.com, davej@codemonkey.org.uk
Subject: Re: [BUG] While changing the cpufreq governor, kernel hits a bug in workqueue.c
References: <485F8028.1070302@linux.vnet.ibm.com> <87y74w41fp.fsf@skyscraper.fehenstaub.lan> <4860BB8E.2070505@linux.vnet.ibm.com> <87tzfh2t5l.fsf@skyscraper.fehenstaub.lan> <48638906.4090308@linux.vnet.ibm.com>
In-Reply-To: <48638906.4090308@linux.vnet.ibm.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 7984
Lines: 194

Nageswara R Sastry wrote:
> Hi,
>
> Johannes Weiner wrote:
>
>> From: Johannes Weiner
>> Subject: cpufreq: cancel self-rearming work synchroneuously
>>
>> The ondemand and conservative governor workers are self-rearming.
>> Cancel them synchroneously to avoid nasty races.
>>
>> Reported-by: Nageswara R Sastry
>> Signed-off-by: Johannes Weiner
>> ---
>>
>> diff --git a/drivers/cpufreq/cpufreq_conservative.c
>> b/drivers/cpufreq/cpufreq_conservative.c
>> index 5d3a04b..78bac06 100644
>> --- a/drivers/cpufreq/cpufreq_conservative.c
>> +++ b/drivers/cpufreq/cpufreq_conservative.c
>> @@ -467,7 +467,7 @@ static inline void dbs_timer_init(void)
>>
>>  static inline void dbs_timer_exit(void)
>>  {
>> -    cancel_delayed_work(&dbs_work);
>> +    cancel_delayed_work_sync(&dbs_work);
>>      return;
>>  }
>>
>> diff --git a/drivers/cpufreq/cpufreq_ondemand.c
>> b/drivers/cpufreq/cpufreq_ondemand.c
>> index d2af20d..1eb8c58 100644
>> --- a/drivers/cpufreq/cpufreq_ondemand.c
>> +++ b/drivers/cpufreq/cpufreq_ondemand.c
>> @@ -490,7 +490,7 @@ static inline void dbs_timer_init(struct
>> cpu_dbs_info_s *dbs_info)
>>  static inline void dbs_timer_exit(struct cpu_dbs_info_s *dbs_info)
>>  {
>>      dbs_info->enable = 0;
>> -    cancel_delayed_work(&dbs_info->work);
>> +    cancel_delayed_work_sync(&dbs_info->work);
>>  }
>>
>>  static int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>
> Applied the above patch only and compiled the kernel and seeing an
> Circular lock related issue at the time of booting. First I am checking
> this and will let you the results by applying both the patches.
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.25.7.cpufreq_patch #2
> -------------------------------------------------------
> S06cpuspeed/3493 is trying to acquire lock:
>   (&(&dbs_info->work)->work){--..}, at: []
> __cancel_work_timer+0x80/0x177
>
> but task is already holding lock:
>   (dbs_mutex){--..}, at: [] cpufreq_governor_dbs+0x25e/0x2ed
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 (dbs_mutex){--..}:
>        [] add_lock_to_list+0x61/0x83
>        [] __lock_acquire+0x953/0xb05
>        [] cpufreq_governor_dbs+0x74/0x2ed
>        [] lock_acquire+0x5f/0x79
>        [] cpufreq_governor_dbs+0x74/0x2ed
>        [] mutex_lock_nested+0xce/0x222
>        [] cpufreq_governor_dbs+0x74/0x2ed
>        [] cpufreq_governor_dbs+0x74/0x2ed
>        [] cpufreq_governor_dbs+0x74/0x2ed
>        [] __cpufreq_governor+0x73/0xa6
>        [] __cpufreq_set_policy+0x13b/0x19e
>        [] cpufreq_add_dev+0x3b4/0x4aa
>        [] handle_update+0x0/0x21
>        [] sysdev_driver_register+0x48/0x9a
>        [] cpufreq_register_driver+0x9b/0x147
>        [] kernel_init+0x130/0x26f
>        [] kernel_init+0x0/0x26f
>        [] kernel_init+0x0/0x26f
>        [] kernel_thread_helper+0x7/0x10
>        [] 0xffffffff
>
> -> #1 (&per_cpu(cpu_policy_rwsem, cpu)){----}:
>        [] __lock_acquire+0x953/0xb05
>        [] lock_policy_rwsem_write+0x30/0x56
>        [] save_stack_trace+0x1a/0x35
>        [] lock_acquire+0x5f/0x79
>        [] lock_policy_rwsem_write+0x30/0x56
>        [] down_write+0x2b/0x44
>        [] lock_policy_rwsem_write+0x30/0x56
>        [] lock_policy_rwsem_write+0x30/0x56
>        [] do_dbs_timer+0x40/0x24f
>        [] run_workqueue+0x81/0x187
>        [] run_workqueue+0xbc/0x187
>        [] run_workqueue+0x81/0x187
>        [] do_dbs_timer+0x0/0x24f
>        [] worker_thread+0x0/0xbd
>        [] worker_thread+0xb3/0xbd
>        [] autoremove_wake_function+0x0/0x2d
>        [] kthread+0x38/0x5d
>        [] kthread+0x0/0x5d
>        [] kernel_thread_helper+0x7/0x10
>        [] 0xffffffff
>
> -> #0 (&(&dbs_info->work)->work){--..}:
>        [] print_circular_bug_tail+0x2a/0x61
>        [] __lock_acquire+0x878/0xb05
>        [] lock_acquire+0x5f/0x79
>        [] __cancel_work_timer+0x80/0x177
>        [] __cancel_work_timer+0xab/0x177
>        [] __cancel_work_timer+0x80/0x177
>        [] mark_held_locks+0x39/0x53
>        [] mutex_lock_nested+0x20f/0x222
>        [] trace_hardirqs_on+0xe7/0x10e
>        [] mutex_lock_nested+0x21a/0x222
>        [] cpufreq_governor_dbs+0x25e/0x2ed
>        [] cpufreq_governor_dbs+0x270/0x2ed
>        [] __cpufreq_governor+0x73/0xa6
>        [] __cpufreq_set_policy+0x129/0x19e
>        [] store_scaling_governor+0x112/0x135
>        [] handle_update+0x0/0x21
>        [] atkbd_set_leds+0x9/0xcf
>        [] store_scaling_governor+0x0/0x135
>        [] store+0x3c/0x54
>        [] sysfs_write_file+0xa9/0xdd
>        [] sysfs_write_file+0x0/0xdd
>        [] vfs_write+0x83/0xf6
>        [] sys_write+0x3c/0x63
>        [] sysenter_past_esp+0x5f/0xa5
>        [] 0xffffffff
>
> other info that might help us debug this:
>
> 3 locks held by S06cpuspeed/3493:
>   #0: (&buffer->mutex){--..}, at: [] sysfs_write_file+0x24/0xdd
>   #1: (&per_cpu(cpu_policy_rwsem, cpu)){----}, at: []
> lock_policy_rwsem_write+0x30/0x56
>   #2: (dbs_mutex){--..}, at: [] cpufreq_governor_dbs+0x25e/0x2ed
>
> stack backtrace:
> Pid: 3493, comm: S06cpuspeed Not tainted 2.6.25.7.cpufreq_patch #2
>   [] print_circular_bug_tail+0x57/0x61
>   [] __lock_acquire+0x878/0xb05
>   [] lock_acquire+0x5f/0x79
>   [] __cancel_work_timer+0x80/0x177
>   [] __cancel_work_timer+0xab/0x177
>   [] __cancel_work_timer+0x80/0x177
>   [] mark_held_locks+0x39/0x53
>   [] mutex_lock_nested+0x20f/0x222
>   [] trace_hardirqs_on+0xe7/0x10e
>   [] mutex_lock_nested+0x21a/0x222
>   [] cpufreq_governor_dbs+0x25e/0x2ed
>   [] cpufreq_governor_dbs+0x270/0x2ed
>   [] __cpufreq_governor+0x73/0xa6
>   [] __cpufreq_set_policy+0x129/0x19e
>   [] store_scaling_governor+0x112/0x135
>   [] handle_update+0x0/0x21
>   [] atkbd_set_leds+0x9/0xcf
>   [] store_scaling_governor+0x0/0x135
>   [] store+0x3c/0x54
>   [] sysfs_write_file+0xa9/0xdd
>   [] sysfs_write_file+0x0/0xdd
>   [] vfs_write+0x83/0xf6
>   [] sys_write+0x3c/0x63
>   [] sysenter_past_esp+0x5f/0xa5
>   =======================
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

While running the script I saw no stack trace, but reading a sysfs file such as scaling_governor under cpufreq hangs for one CPU (which CPU it is varies from run to run). In the following run it was cpu4:

cat /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor

The above command prints nothing, and it does not return even after pressing Ctrl+C or Ctrl+Z.

Regards
R.Nageswara Sastry
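
For illustration only, here is a small userspace sketch of the ordering problem the lockdep report points at. It is not kernel code and not the cpufreq implementation. All names in it (policy_lock, worker, governor_stop_safe, run_demo) are hypothetical stand-ins: policy_lock plays the role of a lock that both the stopping path and the self-rearming worker take (in the report, do_dbs_timer takes cpu_policy_rwsem, which the sysfs writer already holds), and pthread_join() plays the role of a synchronous cancel. Waiting for the worker while still holding a lock the worker needs can block forever, so the sketch drops the lock before waiting.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a lock shared by the worker and the stop path. */
static pthread_mutex_t policy_lock = PTHREAD_MUTEX_INITIALIZER;
static bool enabled;   /* protected by policy_lock */
static int samples;    /* protected by policy_lock */

/* Stand-in for a self-rearming worker: it takes the lock on every pass. */
static void *worker(void *arg)
{
    struct timespec pause = { 0, 1000000 };   /* 1 ms between passes */

    (void)arg;
    for (;;) {
        bool keep_going;

        pthread_mutex_lock(&policy_lock);
        keep_going = enabled;
        if (keep_going)
            samples++;
        pthread_mutex_unlock(&policy_lock);

        if (!keep_going)
            return NULL;
        nanosleep(&pause, NULL);
    }
}

/*
 * Safe ordering: clear the flag under the lock, release the lock, and only
 * then wait for the worker. Calling pthread_join() while still holding
 * policy_lock could hang if the worker is blocked waiting for that lock.
 */
static int governor_stop_safe(pthread_t tid)
{
    pthread_mutex_lock(&policy_lock);
    enabled = false;
    pthread_mutex_unlock(&policy_lock);

    return pthread_join(tid, NULL);
}

/* Starts the worker, lets it run briefly, stops it; returns passes done or -1. */
static int run_demo(void)
{
    pthread_t tid;
    struct timespec run_time = { 0, 20000000 };   /* 20 ms */

    pthread_mutex_lock(&policy_lock);
    enabled = true;
    samples = 0;
    pthread_mutex_unlock(&policy_lock);

    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    nanosleep(&run_time, NULL);
    if (governor_stop_safe(tid) != 0)
        return -1;

    return samples;   /* the worker has exited, so reading without the lock is safe */
}
```

Calling run_demo() from a main() and building with -pthread should return promptly. Moving the pthread_join() call inside the locked region of governor_stop_safe() gives the userspace analogue of waiting for the work item while holding a lock it needs.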