Date: Fri, 3 Jan 2014 15:40:16 +0530
Subject: Re: [PATCH v4] cpufreq: Fix timer/workqueue corruption by protecting reading governor_enabled
From: Viresh Kumar
To: Jane Li
Cc: "Rafael J. Wysocki", "cpufreq@vger.kernel.org", "linux-pm@vger.kernel.org", Linux Kernel Mailing List
In-Reply-To: <1388740661-15011-1-git-send-email-jiel@marvell.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 3 January 2014 14:47, Jane Li wrote:
> From: Jane Li
>
> When a CPU is hot removed we'll cancel all the delayed work items via
> gov_cancel_work(). Sometimes the delayed work function determines that
> it should adjust the delay for all other CPUs that the policy is
> managing. If this scenario occurs, the canceling CPU will cancel its own
> work but queue up the other CPUs works to run.
>
> Commit 3617f2(cpufreq: Fix timer/workqueue corruption due to double
> queueing) has tried to fix this, but reading governor_enabled is not
> protected by cpufreq_governor_lock. Even though od_dbs_timer() checks
> governor_enabled before gov_queue_work(), this scenario may occur. For
> example:
>
>  CPU0                                        CPU1
>  ----                                        ----
>  cpu_down()
>   ...
>   __cpufreq_remove_dev()                     od_dbs_timer()
>    __cpufreq_governor()                       policy->governor_enabled
>     policy->governor_enabled = false;
>     cpufreq_governor_dbs()
>      case CPUFREQ_GOV_STOP:
>       gov_cancel_work(dbs_data, policy);
>        cpu0 work is canceled
>         timer is canceled
>         cpu1 work is canceled
>
>                                               gov_queue_work(*, *, true);
>                                                cpu0 work queued
>                                                cpu1 work queued
>                                                cpu2 work queued
>                                                ...
>         cpu1 work is canceled
>         cpu2 work is canceled
>         ...
>
> At the end of the GOV_STOP case cpu0 still has a work queued to
> run although the code is expecting all of the works to be
> canceled. __cpufreq_remove_dev() will then proceed to
> re-initialize all the other CPUs works except for the CPU that is
> going down. The CPUFREQ_GOV_START case in cpufreq_governor_dbs()
> will trample over the queued work and debugobjects will spit out
> a warning:
>
> WARNING: at lib/debugobjects.c:260 debug_print_object+0x94/0xbc()
> ODEBUG: init active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x14
> Modules linked in:
> CPU: 1 PID: 1205 Comm: sh Tainted: G W 3.10.0 #200
> [] (unwind_backtrace+0x0/0xf8) from [] (show_stack+0x10/0x14)
> [] (show_stack+0x10/0x14) from [] (warn_slowpath_common+0x4c/0x68)
> [] (warn_slowpath_common+0x4c/0x68) from [] (warn_slowpath_fmt+0x30/0x40)
> [] (warn_slowpath_fmt+0x30/0x40) from [] (debug_print_object+0x94/0xbc)
> [] (debug_print_object+0x94/0xbc) from [] (__debug_object_init+0xc8/0x3c0)
> [] (__debug_object_init+0xc8/0x3c0) from [] (init_timer_key+0x20/0x104)
> [] (init_timer_key+0x20/0x104) from [] (cpufreq_governor_dbs+0x1dc/0x68c)
> [] (cpufreq_governor_dbs+0x1dc/0x68c) from [] (__cpufreq_governor+0x80/0x1b0)
> [] (__cpufreq_governor+0x80/0x1b0) from [] (__cpufreq_remove_dev.isra.12+0x22c/0x380)
> [] (__cpufreq_remove_dev.isra.12+0x22c/0x380) from [] (cpufreq_cpu_callback+0x48/0x5c)
> [] (cpufreq_cpu_callback+0x48/0x5c) from [] (notifier_call_chain+0x44/0x84)
> [] (notifier_call_chain+0x44/0x84) from [] (__cpu_notify+0x2c/0x48)
> [] (__cpu_notify+0x2c/0x48) from [] (_cpu_down+0x80/0x258)
> [] (_cpu_down+0x80/0x258) from [] (cpu_down+0x28/0x3c)
> [] (cpu_down+0x28/0x3c) from [] (store_online+0x30/0x74)
> [] (store_online+0x30/0x74) from [] (dev_attr_store+0x18/0x24)
> [] (dev_attr_store+0x18/0x24) from [] (sysfs_write_file+0x100/0x180)
> [] (sysfs_write_file+0x100/0x180) from [] (vfs_write+0xbc/0x184)
> [] (vfs_write+0xbc/0x184) from [] (SyS_write+0x40/0x68)
> [] (SyS_write+0x40/0x68) from [] (ret_fast_syscall+0x0/0x48)
>
> In gov_queue_work(), lock cpufreq_governor_lock before gov_queue_work,
> and unlock it after __gov_queue_work(). In this way, governor_enabled
> is guaranteed not changed in gov_queue_work().
>
> Signed-off-by: Jane Li
> ---
>  drivers/cpufreq/cpufreq.c          | 2 +-
>  drivers/cpufreq/cpufreq_governor.c | 6 +++++-
>  drivers/cpufreq/cpufreq_governor.h | 2 ++
>  3 files changed, 8 insertions(+), 2 deletions(-)

Acked-by: Viresh Kumar
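
For readers following the thread, the locking idea in the quoted description (check governor_enabled and queue the work while holding the same lock that the stop path takes to clear the flag) can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code from the patch: a pthread mutex stands in for cpufreq_governor_lock, and struct policy_sketch, gov_queue_work_sketch(), gov_stop_sketch() and NR_CPUS_SKETCH are hypothetical names invented for the example.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4 /* hypothetical CPU count, for illustration only */

/* Userspace stand-in for cpufreq_governor_lock (assumption). */
static pthread_mutex_t governor_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical, heavily simplified policy: only what the sketch needs. */
struct policy_sketch {
	bool governor_enabled;
	bool work_queued[NR_CPUS_SKETCH];
};

/*
 * Queue work for every CPU, but only if the governor is still enabled.
 * The flag is read and the work is queued under one lock hold, so the
 * flag cannot change between the check and the queueing.
 * Returns the number of work items queued.
 */
static int gov_queue_work_sketch(struct policy_sketch *p)
{
	int queued = 0;

	pthread_mutex_lock(&governor_lock_sketch);
	if (p->governor_enabled) {
		for (size_t i = 0; i < NR_CPUS_SKETCH; i++) {
			p->work_queued[i] = true;
			queued++;
		}
	}
	pthread_mutex_unlock(&governor_lock_sketch);

	return queued;
}

/*
 * Stop path: clear the flag under the same lock, then cancel all work.
 * A queueing call either finished before the flag was cleared (and is
 * cancelled here) or runs afterwards and sees the flag as false.
 */
static void gov_stop_sketch(struct policy_sketch *p)
{
	pthread_mutex_lock(&governor_lock_sketch);
	p->governor_enabled = false;
	pthread_mutex_unlock(&governor_lock_sketch);

	for (size_t i = 0; i < NR_CPUS_SKETCH; i++)
		p->work_queued[i] = false;
}
```

In this sketch, calling gov_queue_work_sketch() after gov_stop_sketch() queues nothing, which mirrors the guarantee the patch description asks for: no work is left queued once the GOV_STOP case has finished.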