2013-06-25 21:16:04

by Sergey Senozhatsky

Subject: [LOCKDEP] cpufreq: possible circular locking dependency detected


[ 60.277396] ======================================================
[ 60.277400] [ INFO: possible circular locking dependency detected ]
[ 60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
[ 60.277411] -------------------------------------------------------
[ 60.277417] bash/2225 is trying to acquire lock:
[ 60.277422] ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
[ 60.277444]
but task is already holding lock:
[ 60.277449] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
[ 60.277465]
which lock already depends on the new lock.

[ 60.277472]
the existing dependency chain (in reverse order) is:
[ 60.277477]
-> #2 (cpu_hotplug.lock){+.+.+.}:
[ 60.277490] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[ 60.277503] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
[ 60.277514] [<ffffffff81042cbc>] get_online_cpus+0x3c/0x60
[ 60.277522] [<ffffffff814b842a>] gov_queue_work+0x2a/0xb0
[ 60.277532] [<ffffffff814b7891>] cs_dbs_timer+0xc1/0xe0
[ 60.277543] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
[ 60.277552] [<ffffffff81063d31>] worker_thread+0x121/0x3a0
[ 60.277560] [<ffffffff8106ae2b>] kthread+0xdb/0xe0
[ 60.277569] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
[ 60.277580]
-> #1 (&j_cdbs->timer_mutex){+.+...}:
[ 60.277592] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[ 60.277600] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
[ 60.277608] [<ffffffff814b785d>] cs_dbs_timer+0x8d/0xe0
[ 60.277616] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
[ 60.277624] [<ffffffff81063d31>] worker_thread+0x121/0x3a0
[ 60.277633] [<ffffffff8106ae2b>] kthread+0xdb/0xe0
[ 60.277640] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
[ 60.277649]
-> #0 ((&(&j_cdbs->work)->work)){+.+...}:
[ 60.277661] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
[ 60.277669] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[ 60.277677] [<ffffffff810621ed>] flush_work+0x3d/0x280
[ 60.277685] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
[ 60.277693] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
[ 60.277701] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
[ 60.277709] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
[ 60.277719] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
[ 60.277728] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
[ 60.277737] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
[ 60.277747] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
[ 60.277759] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
[ 60.277768] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
[ 60.277779] [<ffffffff815a0d46>] cpu_down+0x36/0x50
[ 60.277788] [<ffffffff815a2748>] store_online+0x98/0xd0
[ 60.277796] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
[ 60.277806] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
[ 60.277818] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
[ 60.277826] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
[ 60.277834] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
[ 60.277842]
other info that might help us debug this:

[ 60.277848] Chain exists of:
(&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock

[ 60.277864] Possible unsafe locking scenario:

[ 60.277869]        CPU0                    CPU1
[ 60.277873]        ----                    ----
[ 60.277877]   lock(cpu_hotplug.lock);
[ 60.277885]                               lock(&j_cdbs->timer_mutex);
[ 60.277892]                               lock(cpu_hotplug.lock);
[ 60.277900]   lock((&(&j_cdbs->work)->work));
[ 60.277907]
*** DEADLOCK ***

[ 60.277915] 6 locks held by bash/2225:
[ 60.277919] #0: (sb_writers#6){.+.+.+}, at: [<ffffffff81168173>] vfs_write+0x1c3/0x1f0
[ 60.277937] #1: (&buffer->mutex){+.+.+.}, at: [<ffffffff811d9e3c>] sysfs_write_file+0x3c/0x150
[ 60.277954] #2: (s_active#61){.+.+.+}, at: [<ffffffff811d9ec3>] sysfs_write_file+0xc3/0x150
[ 60.277972] #3: (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81024cf7>] cpu_hotplug_driver_lock+0x17/0x20
[ 60.277990] #4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a0d32>] cpu_down+0x22/0x50
[ 60.278007] #5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
[ 60.278023]
stack backtrace:
[ 60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744
[ 60.278037] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
[ 60.278042] ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38
[ 60.278055] ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2
[ 60.278068] ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00
[ 60.278081] Call Trace:
[ 60.278091] [<ffffffff815b3d90>] dump_stack+0x19/0x1b
[ 60.278101] [<ffffffff815b0a8d>] print_circular_bug+0x2b6/0x2c5
[ 60.278111] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
[ 60.278123] [<ffffffff81067e08>] ? __kernel_text_address+0x58/0x80
[ 60.278134] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[ 60.278142] [<ffffffff810621b5>] ? flush_work+0x5/0x280
[ 60.278151] [<ffffffff810621ed>] flush_work+0x3d/0x280
[ 60.278159] [<ffffffff810621b5>] ? flush_work+0x5/0x280
[ 60.278169] [<ffffffff810a9b14>] ? mark_held_locks+0x94/0x140
[ 60.278178] [<ffffffff81062d77>] ? __cancel_work_timer+0x77/0x120
[ 60.278188] [<ffffffff810a9cbd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[ 60.278196] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
[ 60.278206] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
[ 60.278214] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
[ 60.278225] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
[ 60.278234] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
[ 60.278244] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
[ 60.278255] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
[ 60.278265] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
[ 60.278275] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
[ 60.278284] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
[ 60.278292] [<ffffffff81024cf7>] ? cpu_hotplug_driver_lock+0x17/0x20
[ 60.278302] [<ffffffff815a0d46>] cpu_down+0x36/0x50
[ 60.278311] [<ffffffff815a2748>] store_online+0x98/0xd0
[ 60.278320] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
[ 60.278329] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
[ 60.278337] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
[ 60.278347] [<ffffffff81185950>] ? fget_light+0x320/0x4b0
[ 60.278355] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
[ 60.278364] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
[ 60.280582] smpboot: CPU 1 is now offline


-ss


2013-06-28 04:43:35

by Viresh Kumar

Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On 26 June 2013 02:45, Sergey Senozhatsky <[email protected]> wrote:
>
> [ 60.277396] ======================================================
> [ 60.277400] [ INFO: possible circular locking dependency detected ]
> [ 60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
> [ 60.277411] -------------------------------------------------------
> [ 60.277417] bash/2225 is trying to acquire lock:
> [ 60.277422] ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
> [ 60.277444]
> but task is already holding lock:
> [ 60.277449] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
> [ 60.277465]
> which lock already depends on the new lock.

Hi Sergey,

Can you try reverting this patch?

commit 2f7021a815f20f3481c10884fe9735ce2a56db35
Author: Michael Wang <[email protected]>
Date: Wed Jun 5 08:49:37 2013 +0000

cpufreq: protect 'policy->cpus' from offlining during __gov_queue_work()

2013-06-28 07:44:26

by Sergey Senozhatsky

Subject: Re: [RFC PATCH] cpu hotplug: rework cpu_hotplug locking (was [LOCKDEP] cpufreq: possible circular locking dependency detected)

On (06/28/13 10:13), Viresh Kumar wrote:
> On 26 June 2013 02:45, Sergey Senozhatsky <[email protected]> wrote:
> >
> > [ 60.277396] ======================================================
> > [ 60.277400] [ INFO: possible circular locking dependency detected ]
> > [ 60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
> > [ 60.277411] -------------------------------------------------------
> > [ 60.277417] bash/2225 is trying to acquire lock:
> > [ 60.277422] ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
> > [ 60.277444]
> > but task is already holding lock:
> > [ 60.277449] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
> > [ 60.277465]
> > which lock already depends on the new lock.
>
> Hi Sergey,
>
> Can you try reverting this patch?
>
> commit 2f7021a815f20f3481c10884fe9735ce2a56db35
> Author: Michael Wang <[email protected]>
> Date: Wed Jun 5 08:49:37 2013 +0000
>
> cpufreq: protect 'policy->cpus' from offlining during __gov_queue_work()
>

Hello,
Yes, this helps, of course, but at the same time it brings back the previous
problem -- blocking cpu hotplug in some places.


I have a bit different (perhaps naive) RFC patch and would like to hear
comments.



The idea is to break the existing lock dependency chain by not holding
the cpu_hotplug lock mutex across the calls. In order to detect active
refcount readers or an active writer, the refcount may now have the
following values:

-1: active writer -- only one writer may be active, readers are blocked
 0: no readers/writer
>0: active readers -- many readers may be active, writer is blocked

A "blocked" reader or writer goes to the wait queue. As soon as the writer
finishes (refcount becomes 0), it wakes up all existing processes in the
wait queue. A reader performs the wakeup call only when it sees that a
pending writer is present (active_writer is not NULL).

The cpu_hotplug lock is now only required to protect the refcount compare,
increment and decrement operations, so it can be changed to a spinlock.

The patch has survived the initial beating:

echo 0 > /sys/devices/system/cpu/cpu2/online
echo 0 > /sys/devices/system/cpu/cpu3/online
echo 1 > /sys/devices/system/cpu/cpu3/online
echo 1 > /sys/devices/system/cpu/cpu2/online
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu3/online
echo 1 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu3/online
echo 0 > /sys/devices/system/cpu/cpu2/online
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu3/online
echo 1 > /sys/devices/system/cpu/cpu3/online
echo 1 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu3/online
echo 1 > /sys/devices/system/cpu/cpu3/online


Signed-off-by: Sergey Senozhatsky <[email protected]>

---

kernel/cpu.c | 75 ++++++++++++++++++++++++++++++++++++++++++++----------------
1 file changed, 55 insertions(+), 20 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 198a388..7fa7b0f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -50,17 +50,25 @@ static int cpu_hotplug_disabled;
#ifdef CONFIG_HOTPLUG_CPU

static struct {
- struct task_struct *active_writer;
- struct mutex lock; /* Synchronizes accesses to refcount, */
- /*
- * Also blocks the new readers during
- * an ongoing cpu hotplug operation.
+ /* Synchronizes accesses to refcount, also blocks the new readers
+ * during an ongoing cpu hotplug operation.
+ */
+ spinlock_t lock;
+ /* -1: active cpu hotplug process
+ * 0: unlocked
+ * >0: active refcount readers
*/
int refcount;
+ struct task_struct *active_writer;
+ /* Wait queue for new refcount readers during an ongoing
+ * cpu hotplug operation.
+ */
+ wait_queue_head_t wait;
} cpu_hotplug = {
- .active_writer = NULL,
- .lock = __MUTEX_INITIALIZER(cpu_hotplug.lock),
+ .lock = __SPIN_LOCK_INITIALIZER(cpu_hotplug.lock),
.refcount = 0,
+ .active_writer = NULL,
+ .wait = __WAIT_QUEUE_HEAD_INITIALIZER(cpu_hotplug.wait),
};

void get_online_cpus(void)
@@ -68,10 +76,24 @@ void get_online_cpus(void)
might_sleep();
if (cpu_hotplug.active_writer == current)
return;
- mutex_lock(&cpu_hotplug.lock);
- cpu_hotplug.refcount++;
- mutex_unlock(&cpu_hotplug.lock);

+ for (;;) {
+ DECLARE_WAITQUEUE(wait, current);
+
+ spin_lock(&cpu_hotplug.lock);
+ if (++cpu_hotplug.refcount > 0) {
+ spin_unlock(&cpu_hotplug.lock);
+ break;
+ }
+ /* Ongoing cpu hotplug process */
+ cpu_hotplug.refcount--;
+ add_wait_queue(&cpu_hotplug.wait, &wait);
+ __set_current_state(TASK_UNINTERRUPTIBLE);
+
+ spin_unlock(&cpu_hotplug.lock);
+ schedule();
+ remove_wait_queue(&cpu_hotplug.wait, &wait);
+ }
}
EXPORT_SYMBOL_GPL(get_online_cpus);

@@ -79,15 +101,15 @@ void put_online_cpus(void)
{
if (cpu_hotplug.active_writer == current)
return;
- mutex_lock(&cpu_hotplug.lock);
+ spin_lock(&cpu_hotplug.lock);

- if (WARN_ON(!cpu_hotplug.refcount))
+ if (WARN_ON(cpu_hotplug.refcount == 0))
cpu_hotplug.refcount++; /* try to fix things up */
+ cpu_hotplug.refcount--;

- if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
- wake_up_process(cpu_hotplug.active_writer);
- mutex_unlock(&cpu_hotplug.lock);
-
+ if (unlikely(cpu_hotplug.active_writer))
+ wake_up(&cpu_hotplug.wait);
+ spin_unlock(&cpu_hotplug.lock);
}
EXPORT_SYMBOL_GPL(put_online_cpus);

@@ -118,19 +140,32 @@ static void cpu_hotplug_begin(void)
cpu_hotplug.active_writer = current;

for (;;) {
- mutex_lock(&cpu_hotplug.lock);
- if (likely(!cpu_hotplug.refcount))
+ DECLARE_WAITQUEUE(wait, current);
+
+ spin_lock(&cpu_hotplug.lock);
+ if (likely(--cpu_hotplug.refcount == -1)) {
+ spin_unlock(&cpu_hotplug.lock);
break;
+ }
+ /* Refcount readers present */
+ cpu_hotplug.refcount++;
+ add_wait_queue(&cpu_hotplug.wait, &wait);
__set_current_state(TASK_UNINTERRUPTIBLE);
- mutex_unlock(&cpu_hotplug.lock);
+
+ spin_unlock(&cpu_hotplug.lock);
schedule();
+ remove_wait_queue(&cpu_hotplug.wait, &wait);
}
}

static void cpu_hotplug_done(void)
{
+ spin_lock(&cpu_hotplug.lock);
cpu_hotplug.active_writer = NULL;
- mutex_unlock(&cpu_hotplug.lock);
+ cpu_hotplug.refcount++;
+ spin_unlock(&cpu_hotplug.lock);
+
+ wake_up(&cpu_hotplug.wait);
}

/*

2013-06-28 09:35:16

by Srivatsa S. Bhat

Subject: Re: [RFC PATCH] cpu hotplug: rework cpu_hotplug locking (was [LOCKDEP] cpufreq: possible circular locking dependency detected)

On 06/28/2013 01:14 PM, Sergey Senozhatsky wrote:
> On (06/28/13 10:13), Viresh Kumar wrote:
>> On 26 June 2013 02:45, Sergey Senozhatsky <[email protected]> wrote:
>>>
>>> [ 60.277396] ======================================================
>>> [ 60.277400] [ INFO: possible circular locking dependency detected ]
>>> [ 60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
>>> [ 60.277411] -------------------------------------------------------
>>> [ 60.277417] bash/2225 is trying to acquire lock:
>>> [ 60.277422] ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
>>> [ 60.277444]
>>> but task is already holding lock:
>>> [ 60.277449] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
>>> [ 60.277465]
>>> which lock already depends on the new lock.
>>
>> Hi Sergey,
>>
>> Can you try reverting this patch?
>>
>> commit 2f7021a815f20f3481c10884fe9735ce2a56db35
>> Author: Michael Wang <[email protected]>
>> Date: Wed Jun 5 08:49:37 2013 +0000
>>
>> cpufreq: protect 'policy->cpus' from offlining during __gov_queue_work()
>>
>
> Hello,
> Yes, this helps, of course, but at the same time it brings back the previous
> problem -- blocking cpu hotplug in some places.
>
>
> I have a bit different (perhaps naive) RFC patch and would like to hear
> comments.
>
>
>
> The idea is to break the existing lock dependency chain by not holding
> the cpu_hotplug lock mutex across the calls. In order to detect active
> refcount readers or an active writer, the refcount may now have the
> following values:
>
> -1: active writer -- only one writer may be active, readers are blocked
>  0: no readers/writer
> >0: active readers -- many readers may be active, writer is blocked
>
> A "blocked" reader or writer goes to the wait queue. As soon as the writer
> finishes (refcount becomes 0), it wakes up all existing processes in the
> wait queue. A reader performs the wakeup call only when it sees that a
> pending writer is present (active_writer is not NULL).
>
> The cpu_hotplug lock is now only required to protect the refcount compare,
> increment and decrement operations, so it can be changed to a spinlock.
>

It's best to avoid changing the core infrastructure in order to fix some
call-site, unless that scenario is really impossible to handle with the
current infrastructure.

I have a couple of suggestions below, to solve this issue, without touching
the core hotplug code:

You can perhaps try cancelling the work item in two steps:
a. using cancel_delayed_work() under CPU_DOWN_PREPARE
b. using cancel_delayed_work_sync() under CPU_POST_DEAD

And of course, destroy the resources associated with that work (like
the timer_mutex) only after the full tear-down.

Or perhaps you might find a way to perform the tear-down in just one step
at the CPU_POST_DEAD stage. Whatever works correctly.

The key point here is that the core CPU hotplug code provides us with the
CPU_POST_DEAD stage, where the hotplug lock is _not_ held. Which is exactly
what you want in solving the issue with cpufreq.

Regards,
Srivatsa S. Bhat

2013-06-28 10:04:44

by Sergey Senozhatsky

Subject: Re: [RFC PATCH] cpu hotplug: rework cpu_hotplug locking (was [LOCKDEP] cpufreq: possible circular locking dependency detected)

On (06/28/13 15:01), Srivatsa S. Bhat wrote:
> On 06/28/2013 01:14 PM, Sergey Senozhatsky wrote:
> > On (06/28/13 10:13), Viresh Kumar wrote:
> >> On 26 June 2013 02:45, Sergey Senozhatsky <[email protected]> wrote:
> >>>
> >>> [ 60.277396] ======================================================
> >>> [ 60.277400] [ INFO: possible circular locking dependency detected ]
> >>> [ 60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
> >>> [ 60.277411] -------------------------------------------------------
> >>> [ 60.277417] bash/2225 is trying to acquire lock:
> >>> [ 60.277422] ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
> >>> [ 60.277444]
> >>> but task is already holding lock:
> >>> [ 60.277449] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
> >>> [ 60.277465]
> >>> which lock already depends on the new lock.
> >>
> >> Hi Sergey,
> >>
> >> Can you try reverting this patch?
> >>
> >> commit 2f7021a815f20f3481c10884fe9735ce2a56db35
> >> Author: Michael Wang <[email protected]>
> >> Date: Wed Jun 5 08:49:37 2013 +0000
> >>
> >> cpufreq: protect 'policy->cpus' from offlining during __gov_queue_work()
> >>
> >
> > Hello,
> > Yes, this helps, of course, but at the same time it brings back the previous
> > problem -- blocking cpu hotplug in some places.
> >
> >
> > I have a bit different (perhaps naive) RFC patch and would like to hear
> > comments.
> >
> >
> >
> > The idea is to break the existing lock dependency chain by not holding
> > the cpu_hotplug lock mutex across the calls. In order to detect active
> > refcount readers or an active writer, the refcount may now have the
> > following values:
> >
> > -1: active writer -- only one writer may be active, readers are blocked
> >  0: no readers/writer
> > >0: active readers -- many readers may be active, writer is blocked
> >
> > A "blocked" reader or writer goes to the wait queue. As soon as the writer
> > finishes (refcount becomes 0), it wakes up all existing processes in the
> > wait queue. A reader performs the wakeup call only when it sees that a
> > pending writer is present (active_writer is not NULL).
> >
> > The cpu_hotplug lock is now only required to protect the refcount compare,
> > increment and decrement operations, so it can be changed to a spinlock.
> >
>
> It's best to avoid changing the core infrastructure in order to fix some
> call-site, unless that scenario is really impossible to handle with the
> current infrastructure.
>
> I have a couple of suggestions below, to solve this issue, without touching
> the core hotplug code:
>
> You can perhaps try cancelling the work item in two steps:
> a. using cancel_delayed_work() under CPU_DOWN_PREPARE
> b. using cancel_delayed_work_sync() under CPU_POST_DEAD
>
> And of course, destroy the resources associated with that work (like
> the timer_mutex) only after the full tear-down.
>
> Or perhaps you might find a way to perform the tear-down in just one step
> at the CPU_POST_DEAD stage. Whatever works correctly.
>
> The key point here is that the core CPU hotplug code provides us with the
> CPU_POST_DEAD stage, where the hotplug lock is _not_ held. Which is exactly
> what you want in solving the issue with cpufreq.
>

Thanks for your ideas, I'll take a look.

cpu_hotplug mutex seems to be a troubling part in several places, not only
cpufreq. for example:
https://lkml.org/lkml/2012/12/20/357


-ss

> Regards,
> Srivatsa S. Bhat
>

2013-06-28 14:17:14

by Srivatsa S. Bhat

Subject: Re: [RFC PATCH] cpu hotplug: rework cpu_hotplug locking (was [LOCKDEP] cpufreq: possible circular locking dependency detected)

On 06/28/2013 01:14 PM, Sergey Senozhatsky wrote:
> Hello,
> Yes, this helps, of course, but at the same time it brings back the previous
> problem -- blocking cpu hotplug in some places.
>
>
> I have a bit different (perhaps naive) RFC patch and would like to hear
> comments.
>
>
>
> The idea is to break the existing lock dependency chain by not holding
> the cpu_hotplug lock mutex across the calls. In order to detect active
> refcount readers or an active writer, the refcount may now have the
> following values:
>
> -1: active writer -- only one writer may be active, readers are blocked
>  0: no readers/writer
> >0: active readers -- many readers may be active, writer is blocked
>
> A "blocked" reader or writer goes to the wait queue. As soon as the writer
> finishes (refcount becomes 0), it wakes up all existing processes in the
> wait queue. A reader performs the wakeup call only when it sees that a
> pending writer is present (active_writer is not NULL).
>
> The cpu_hotplug lock is now only required to protect the refcount compare,
> increment and decrement operations, so it can be changed to a spinlock.
>

Hmm, now that I actually looked at your patch, I see that it is completely
wrong! I'm sure you intended to fix the *bug*, but instead you ended
up merely silencing the *warning* (and also left lockdep blind), leaving
the actual bug as it is!

So let me summarize what the actual bug is and what is it that actually
needs fixing:

Basically you have 2 things -
1. A worker item (cs_dbs_timer in this case) that can re-arm itself
using gov_queue_work(). And gov_queue_work() uses get/put_online_cpus().

2. In the cpu_down() path, you want to cancel the worker item and destroy
and cleanup its resources (the timer_mutex).

So the problem is that you can deadlock like this:

CPU 3                              CPU 4

cpu_down()
-> acquire hotplug.lock
                                   cs_dbs_timer()
                                   -> get_online_cpus()
                                   //wait on hotplug.lock

try to cancel cs_dbs_timer()
synchronously.

That leads to a deadlock, because, cs_dbs_timer() is waiting to
get the hotplug lock which CPU 3 is holding, whereas CPU 3 is
waiting for cs_dbs_timer() to finish. So they can end up mutually
waiting for each other, forever. (Yeah, the lockdep splat might have
been a bit cryptic to decode this, but here it is).

So to fix the *bug*, you need to avoid waiting synchronously while
holding the hotplug lock. Possibly by using cancel_delayed_work_sync()
under CPU_POST_DEAD or something like that. That would remove the deadlock
possibility.

Your patch, on the other hand, doesn't remove the deadlock possibility:
just because you don't hold the lock throughout the hotplug operation
doesn't mean that the task calling get_online_cpus() can sneak in and
finish its work in-between a hotplug operation (because the refcount
won't allow it to). Also, it should *not* be allowed to sneak in like
that, since that constitutes *racing* with CPU hotplug, which it was
meant to avoid!

Also, as a side effect of not holding the lock throughout the hotplug
operation, lockdep goes blind, and doesn't complain, even though the
actual bug is still there! Effectively, this is nothing but papering
over the bug and silencing the warning, which we should never do.

So, please, fix the _cpufreq_ code to resolve the deadlock.

Regards,
Srivatsa S. Bhat

2013-06-29 07:36:23

by Sergey Senozhatsky

Subject: Re: [RFC PATCH] cpu hotplug: rework cpu_hotplug locking (was [LOCKDEP] cpufreq: possible circular locking dependency detected)

On (06/28/13 19:43), Srivatsa S. Bhat wrote:
> On 06/28/2013 01:14 PM, Sergey Senozhatsky wrote:
> > Hello,
> > Yes, this helps, of course, but at the same time it brings back the previous
> > problem -- blocking cpu hotplug in some places.
> >
> >
> > I have a bit different (perhaps naive) RFC patch and would like to hear
> > comments.
> >
> >
> >
> > The idea is to break the existing lock dependency chain by not holding
> > the cpu_hotplug lock mutex across the calls. In order to detect active
> > refcount readers or an active writer, the refcount may now have the
> > following values:
> >
> > -1: active writer -- only one writer may be active, readers are blocked
> >  0: no readers/writer
> > >0: active readers -- many readers may be active, writer is blocked
> >
> > A "blocked" reader or writer goes to the wait queue. As soon as the writer
> > finishes (refcount becomes 0), it wakes up all existing processes in the
> > wait queue. A reader performs the wakeup call only when it sees that a
> > pending writer is present (active_writer is not NULL).
> >
> > The cpu_hotplug lock is now only required to protect the refcount compare,
> > increment and decrement operations, so it can be changed to a spinlock.
> >
>
> Hmm, now that I actually looked at your patch, I see that it is completely
> wrong! I'm sure you intended to fix the *bug*, but instead you ended
> up merely silencing the *warning* (and also left lockdep blind), leaving
> the actual bug as it is!
>

Thank you for your time and review.


> So let me summarize what the actual bug is and what is it that actually
> needs fixing:
>
> Basically you have 2 things -
> 1. A worker item (cs_dbs_timer in this case) that can re-arm itself
> using gov_queue_work(). And gov_queue_work() uses get/put_online_cpus().
>
> 2. In the cpu_down() path, you want to cancel the worker item and destroy
> and cleanup its resources (the timer_mutex).
>
> So the problem is that you can deadlock like this:
>
> CPU 3                              CPU 4
>
> cpu_down()
> -> acquire hotplug.lock
>                                    cs_dbs_timer()
>                                    -> get_online_cpus()
>                                    //wait on hotplug.lock
>
> try to cancel cs_dbs_timer()
> synchronously.
>
> That leads to a deadlock, because, cs_dbs_timer() is waiting to
> get the hotplug lock which CPU 3 is holding, whereas CPU 3 is
> waiting for cs_dbs_timer() to finish. So they can end up mutually
> waiting for each other, forever. (Yeah, the lockdep splat might have
> been a bit cryptic to decode this, but here it is).
>
> So to fix the *bug*, you need to avoid waiting synchronously while
> holding the hotplug lock. Possibly by using cancel_delayed_work_sync()
> under CPU_POST_DEAD or something like that. That would remove the deadlock
> possibility.

will take a look. Thank you!

-ss

> Your patch, on the other hand, doesn't remove the deadlock possibility:
> just because you don't hold the lock throughout the hotplug operation
> doesn't mean that the task calling get_online_cpus() can sneak in and
> finish its work in-between a hotplug operation (because the refcount
> won't allow it to). Also, it should *not* be allowed to sneak in like
> that, since that constitutes *racing* with CPU hotplug, which it was
> meant to avoid!
>
> Also, as a side effect of not holding the lock throughout the hotplug
> operation, lockdep goes blind, and doesn't complain, even though the
> actual bug is still there! Effectively, this is nothing but papering
> over the bug and silencing the warning, which we should never do.
>
> So, please, fix the _cpufreq_ code to resolve the deadlock.
>
> Regards,
> Srivatsa S. Bhat
>

2013-07-01 04:42:14

by Michael Wang

Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

Hi, Sergey

On 06/26/2013 05:15 AM, Sergey Senozhatsky wrote:
[snip]
>
> [ 60.277848] Chain exists of:
> (&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock
>
> [ 60.277864] Possible unsafe locking scenario:
>
> [ 60.277869]        CPU0                    CPU1
> [ 60.277873]        ----                    ----
> [ 60.277877]   lock(cpu_hotplug.lock);
> [ 60.277885]                               lock(&j_cdbs->timer_mutex);
> [ 60.277892]                               lock(cpu_hotplug.lock);
> [ 60.277900]   lock((&(&j_cdbs->work)->work));
> [ 60.277907]
> *** DEADLOCK ***

It may be caused by 'j_cdbs->work.work' and 'j_cdbs->timer_mutex'
having the same lock class, although they are different locks...

This may help fix the issue:

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 5af40ad..aa05eaa 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -229,6 +229,8 @@ static void set_sampling_rate(struct dbs_data *dbs_data,
}
}

+static struct lock_class_key j_cdbs_key;
+
int cpufreq_governor_dbs(struct cpufreq_policy *policy,
struct common_dbs_data *cdata, unsigned int event)
{
@@ -366,6 +368,8 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
kcpustat_cpu(j).cpustat[CPUTIME_NICE];

mutex_init(&j_cdbs->timer_mutex);
+ lockdep_set_class(&j_cdbs->timer_mutex, &j_cdbs_key);
+
INIT_DEFERRABLE_WORK(&j_cdbs->work,
dbs_data->cdata->gov_dbs_timer);
}

Would you like to give it a try?

Regards,
Michael Wang

>
> [ 60.277915] 6 locks held by bash/2225:
> [ 60.277919] #0: (sb_writers#6){.+.+.+}, at: [<ffffffff81168173>] vfs_write+0x1c3/0x1f0
> [ 60.277937] #1: (&buffer->mutex){+.+.+.}, at: [<ffffffff811d9e3c>] sysfs_write_file+0x3c/0x150
> [ 60.277954] #2: (s_active#61){.+.+.+}, at: [<ffffffff811d9ec3>] sysfs_write_file+0xc3/0x150
> [ 60.277972] #3: (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81024cf7>] cpu_hotplug_driver_lock+0x17/0x20
> [ 60.277990] #4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a0d32>] cpu_down+0x22/0x50
> [ 60.278007] #5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
> [ 60.278023]
> stack backtrace:
> [ 60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744
> [ 60.278037] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
> [ 60.278042] ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38
> [ 60.278055] ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2
> [ 60.278068] ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00
> [ 60.278081] Call Trace:
> [ 60.278091] [<ffffffff815b3d90>] dump_stack+0x19/0x1b
> [ 60.278101] [<ffffffff815b0a8d>] print_circular_bug+0x2b6/0x2c5
> [ 60.278111] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
> [ 60.278123] [<ffffffff81067e08>] ? __kernel_text_address+0x58/0x80
> [ 60.278134] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
> [ 60.278142] [<ffffffff810621b5>] ? flush_work+0x5/0x280
> [ 60.278151] [<ffffffff810621ed>] flush_work+0x3d/0x280
> [ 60.278159] [<ffffffff810621b5>] ? flush_work+0x5/0x280
> [ 60.278169] [<ffffffff810a9b14>] ? mark_held_locks+0x94/0x140
> [ 60.278178] [<ffffffff81062d77>] ? __cancel_work_timer+0x77/0x120
> [ 60.278188] [<ffffffff810a9cbd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
> [ 60.278196] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
> [ 60.278206] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
> [ 60.278214] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
> [ 60.278225] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
> [ 60.278234] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
> [ 60.278244] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
> [ 60.278255] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
> [ 60.278265] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
> [ 60.278275] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
> [ 60.278284] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
> [ 60.278292] [<ffffffff81024cf7>] ? cpu_hotplug_driver_lock+0x17/0x20
> [ 60.278302] [<ffffffff815a0d46>] cpu_down+0x36/0x50
> [ 60.278311] [<ffffffff815a2748>] store_online+0x98/0xd0
> [ 60.278320] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
> [ 60.278329] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
> [ 60.278337] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
> [ 60.278347] [<ffffffff81185950>] ? fget_light+0x320/0x4b0
> [ 60.278355] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
> [ 60.278364] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
> [ 60.280582] smpboot: CPU 1 is now offline
>
>
> -ss
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2013-07-10 23:13:41

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On (07/01/13 12:42), Michael Wang wrote:
> On 06/26/2013 05:15 AM, Sergey Senozhatsky wrote:
> [snip]
> >
> > [ 60.277848] Chain exists of:
> > (&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock
> >
> > [ 60.277864] Possible unsafe locking scenario:
> >
> > [ 60.277869] CPU0 CPU1
> > [ 60.277873] ---- ----
> > [ 60.277877] lock(cpu_hotplug.lock);
> > [ 60.277885] lock(&j_cdbs->timer_mutex);
> > [ 60.277892] lock(cpu_hotplug.lock);
> > [ 60.277900] lock((&(&j_cdbs->work)->work));
> > [ 60.277907]
> > *** DEADLOCK ***
>
> It may be caused by 'j_cdbs->work.work' and 'j_cdbs->timer_mutex'
> having the same lock class, although they are different locks...
>
> This may help fix the issue:
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 5af40ad..aa05eaa 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -229,6 +229,8 @@ static void set_sampling_rate(struct dbs_data *dbs_data,
> }
> }
>
> +static struct lock_class_key j_cdbs_key;
> +
> int cpufreq_governor_dbs(struct cpufreq_policy *policy,
> struct common_dbs_data *cdata, unsigned int event)
> {
> @@ -366,6 +368,8 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
> kcpustat_cpu(j).cpustat[CPUTIME_NICE];
>
> mutex_init(&j_cdbs->timer_mutex);
> + lockdep_set_class(&j_cdbs->timer_mutex, &j_cdbs_key);
> +
> INIT_DEFERRABLE_WORK(&j_cdbs->work,
> dbs_data->cdata->gov_dbs_timer);
> }
>
> Would you like to give it a try?
>

Hello,
sorry for the long delay in replying. Unfortunately, it seems it doesn't help.


Please kindly review the following patch.



Remove the cpu device only upon successful cpu down, on the CPU_POST_DEAD
event, so we can kill off the CPU_DOWN_FAILED case and eliminate the
potential extra remove/add path:

hotplug lock
CPU_DOWN_PREPARE: __cpufreq_remove_dev
CPU_DOWN_FAILED: cpufreq_add_dev
hotplug unlock

Since the cpu is still present on the CPU_DEAD event, the cpu stats table
should be kept longer and removed later, on CPU_POST_DEAD as well.

Because the CPU_POST_DEAD action is performed with the hotplug lock released,
CPU_DOWN might block an existing gov_queue_work() user (blocked on
get_online_cpus()) and unblock it with one of policy->cpus offlined; thus a
cpu_is_offline() check is performed in __gov_queue_work().

Besides, the existing gov_queue_work() hotplug guard is extended to protect
all __gov_queue_work() calls: both the all_cpus and !all_cpus cases.

CPUFREQ_GOV_START performs a direct __gov_queue_work() call because the
hotplug lock is already held there, as opposed to the previous
gov_queue_work() with its nested get/put_online_cpus().

Signed-off-by: Sergey Senozhatsky <[email protected]>

---

drivers/cpufreq/cpufreq.c | 5 +----
drivers/cpufreq/cpufreq_governor.c | 17 +++++++++++------
drivers/cpufreq/cpufreq_stats.c | 2 +-
3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 6a015ad..f8aacf1 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1943,13 +1943,10 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
case CPU_ONLINE:
cpufreq_add_dev(dev, NULL);
break;
- case CPU_DOWN_PREPARE:
+ case CPU_POST_DEAD:
case CPU_UP_CANCELED_FROZEN:
__cpufreq_remove_dev(dev, NULL);
break;
- case CPU_DOWN_FAILED:
- cpufreq_add_dev(dev, NULL);
- break;
}
}
return NOTIFY_OK;
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 4645876..681d5d6 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -125,7 +125,11 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
unsigned int delay)
{
struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
-
+ /* cpu offline might block an existing gov_queue_work() user,
+ * unblocking it after CPU_DEAD and before CPU_POST_DEAD;
+ * thus potentially we can hit an offlined CPU */
+ if (unlikely(cpu_is_offline(cpu)))
+ return;
mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
}

@@ -133,15 +137,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
unsigned int delay, bool all_cpus)
{
int i;
-
+ get_online_cpus();
if (!all_cpus) {
__gov_queue_work(smp_processor_id(), dbs_data, delay);
} else {
- get_online_cpus();
for_each_cpu(i, policy->cpus)
__gov_queue_work(i, dbs_data, delay);
- put_online_cpus();
}
+ put_online_cpus();
}
EXPORT_SYMBOL_GPL(gov_queue_work);

@@ -354,8 +357,10 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
/* Initiate timer time stamp */
cpu_cdbs->time_stamp = ktime_get();

- gov_queue_work(dbs_data, policy,
- delay_for_sampling_rate(sampling_rate), true);
+ /* hotplug lock already held */
+ for_each_cpu(j, policy->cpus)
+ __gov_queue_work(j, dbs_data,
+ delay_for_sampling_rate(sampling_rate));
break;

case CPUFREQ_GOV_STOP:
diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
index cd9e817..833816e 100644
--- a/drivers/cpufreq/cpufreq_stats.c
+++ b/drivers/cpufreq/cpufreq_stats.c
@@ -355,7 +355,7 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
case CPU_DOWN_PREPARE:
cpufreq_stats_free_sysfs(cpu);
break;
- case CPU_DEAD:
+ case CPU_POST_DEAD:
cpufreq_stats_free_table(cpu);
break;
case CPU_UP_CANCELED_FROZEN:

2013-07-11 02:43:57

by Michael wang

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

Hi, Sergey

On 07/11/2013 07:13 AM, Sergey Senozhatsky wrote:
[snip]
>
>
> Please kindly review the following patch.
>
>
>
> Remove cpu device only upon succesful cpu down on CPU_POST_DEAD event,
> so we can kill off CPU_DOWN_FAILED case and eliminate potential extra
> remove/add path:
>
> hotplug lock
> CPU_DOWN_PREPARE: __cpufreq_remove_dev
> CPU_DOWN_FAILED: cpufreq_add_dev
> hotplug unlock
>
> Since cpu still present on CPU_DEAD event, cpu stats table should be
> kept longer and removed later on CPU_POST_DEAD as well.
>
> Because CPU_POST_DEAD action performed with hotplug lock released, CPU_DOWN
> might block existing gov_queue_work() user (blocked on get_online_cpus())
> and unblock it with one of policy->cpus offlined, thus cpu_is_offline()
> check is performed in __gov_queue_work().
>
> Besides, existing gov_queue_work() hotplug guard extended to protect all
> __gov_queue_work() calls: for both all_cpus and !all_cpus cases.
>
> CPUFREQ_GOV_START performs direct __gov_queue_work() call because hotplug
> lock already held there, opposing to previous gov_queue_work() and nested
> get/put_online_cpus().

Nice to know you have some idea on solving the issue ;-)

I'm not sure whether I've caught the idea, but it seems like you are
trying to reorganize the timing of device add/remove.

I'm sure that we have more than one way to solve the issues, but what
we need is a cure for the root cause...

As Srivatsa discovered, the root issue may be:
gov_cancel_work() failed to stop all the work after its return.

And Viresh also confirmed that this is not by design.

Which means gov_queue_work() invoked by od_dbs_timer() is supposed to
never happen after the CPUFREQ_GOV_STOP notify; the whole policy should
stop working at that time.

But it failed to, and the work running concurrently with the cpu dying
caused the first problem.

Thus I think we should focus on this; I suggested the fix below and I'd
like to know your opinion :)

Regards,
Michael Wang

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index dc9b72e..a64b544 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
{
int i;

+ if (dbs_data->queue_stop)
+ return;
+
if (!all_cpus) {
__gov_queue_work(smp_processor_id(), dbs_data, delay);
} else {
- get_online_cpus();
for_each_cpu(i, policy->cpus)
__gov_queue_work(i, dbs_data, delay);
- put_online_cpus();
}
}
EXPORT_SYMBOL_GPL(gov_queue_work);
@@ -193,12 +194,27 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
struct cpufreq_policy *policy)
{
struct cpu_dbs_common_info *cdbs;
- int i;
+ int i, round = 2;

+ dbs_data->queue_stop = 1;
+redo:
+ round--;
for_each_cpu(i, policy->cpus) {
cdbs = dbs_data->cdata->get_cpu_cdbs(i);
cancel_delayed_work_sync(&cdbs->work);
}
+
+ /*
+ * Since there is no lock to prevent re-queueing of the
+ * cancelled work, some early-cancelled work might
+ * have been queued again by a later-cancelled work.
+ *
+ * Flush the work again with dbs_data->queue_stop
+ * enabled, this time there will be no survivors.
+ */
+ if (round)
+ goto redo;
+ dbs_data->queue_stop = 0;
}

/* Will return if we need to evaluate cpu load again or not */
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index e16a961..9116135 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -213,6 +213,7 @@ struct dbs_data {
unsigned int min_sampling_rate;
int usage_count;
void *tuners;
+ int queue_stop;

/* dbs_mutex protects dbs_enable in governor start/stop */
struct mutex mutex;

>
> Signed-off-by: Sergey Senozhatsky <[email protected]>
>
> ---
>
> drivers/cpufreq/cpufreq.c | 5 +----
> drivers/cpufreq/cpufreq_governor.c | 17 +++++++++++------
> drivers/cpufreq/cpufreq_stats.c | 2 +-
> 3 files changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 6a015ad..f8aacf1 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1943,13 +1943,10 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
> case CPU_ONLINE:
> cpufreq_add_dev(dev, NULL);
> break;
> - case CPU_DOWN_PREPARE:
> + case CPU_POST_DEAD:
> case CPU_UP_CANCELED_FROZEN:
> __cpufreq_remove_dev(dev, NULL);
> break;
> - case CPU_DOWN_FAILED:
> - cpufreq_add_dev(dev, NULL);
> - break;
> }
> }
> return NOTIFY_OK;
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 4645876..681d5d6 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -125,7 +125,11 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
> unsigned int delay)
> {
> struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
> -
> + /* cpu offline might block existing gov_queue_work() user,
> + * unblocking it after CPU_DEAD and before CPU_POST_DEAD.
> + * thus potentially we can hit offlined CPU */
> + if (unlikely(cpu_is_offline(cpu)))
> + return;
> mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
> }
>
> @@ -133,15 +137,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> unsigned int delay, bool all_cpus)
> {
> int i;
> -
> + get_online_cpus();
> if (!all_cpus) {
> __gov_queue_work(smp_processor_id(), dbs_data, delay);
> } else {
> - get_online_cpus();
> for_each_cpu(i, policy->cpus)
> __gov_queue_work(i, dbs_data, delay);
> - put_online_cpus();
> }
> + put_online_cpus();
> }
> EXPORT_SYMBOL_GPL(gov_queue_work);
>
> @@ -354,8 +357,10 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
> /* Initiate timer time stamp */
> cpu_cdbs->time_stamp = ktime_get();
>
> - gov_queue_work(dbs_data, policy,
> - delay_for_sampling_rate(sampling_rate), true);
> + /* hotplug lock already held */
> + for_each_cpu(j, policy->cpus)
> + __gov_queue_work(j, dbs_data,
> + delay_for_sampling_rate(sampling_rate));
> break;
>
> case CPUFREQ_GOV_STOP:
> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
> index cd9e817..833816e 100644
> --- a/drivers/cpufreq/cpufreq_stats.c
> +++ b/drivers/cpufreq/cpufreq_stats.c
> @@ -355,7 +355,7 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
> case CPU_DOWN_PREPARE:
> cpufreq_stats_free_sysfs(cpu);
> break;
> - case CPU_DEAD:
> + case CPU_POST_DEAD:
> cpufreq_stats_free_table(cpu);
> break;
> case CPU_UP_CANCELED_FROZEN:

2013-07-11 08:22:58

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On (07/11/13 10:43), Michael Wang wrote:
> Hi, Sergey
>
> On 07/11/2013 07:13 AM, Sergey Senozhatsky wrote:
> [snip]
> >
> >
> > Please kindly review the following patch.
> >
> >
> >
> > Remove cpu device only upon successful cpu down on CPU_POST_DEAD event,
> > so we can kill off CPU_DOWN_FAILED case and eliminate potential extra
> > remove/add path:
> >
> > hotplug lock
> > CPU_DOWN_PREPARE: __cpufreq_remove_dev
> > CPU_DOWN_FAILED: cpufreq_add_dev
> > hotplug unlock
> >
> > Since cpu still present on CPU_DEAD event, cpu stats table should be
> > kept longer and removed later on CPU_POST_DEAD as well.
> >
> > Because CPU_POST_DEAD action performed with hotplug lock released, CPU_DOWN
> > might block existing gov_queue_work() user (blocked on get_online_cpus())
> > and unblock it with one of policy->cpus offlined, thus cpu_is_offline()
> > check is performed in __gov_queue_work().
> >
> > Besides, existing gov_queue_work() hotplug guard extended to protect all
> > __gov_queue_work() calls: for both all_cpus and !all_cpus cases.
> >
> > CPUFREQ_GOV_START performs direct __gov_queue_work() call because hotplug
> > lock already held there, opposing to previous gov_queue_work() and nested
> > get/put_online_cpus().
>
> Nice to know you have some idea on solving the issue ;-)
>
> I'm not sure whether I catch the idea, but seems like you are trying
> to re-organize the timing of add/remove device.
>
> I'm sure that we have more than one way to solve the issues, but what
> we need is the cure of root...
>
> As Srivatsa discovered, the root issue may be:
> gov_cancel_work() failed to stop all the work after it's return.
>
> And Viresh also confirmed that this is not by-designed.
>
> Which means gov_queue_work() invoked by od_dbs_timer() is supposed to
> never happen after CPUFREQ_GOV_STOP notify, the whole policy should
> stop working at that time.
>
> But it failed to, and the work concurrent with cpu dying caused the
> first problem.
>
> Thus I think we should focus on this and suggested below fix, I'd like
> to know your opinions :)
>

Hello Michael,
nice job! works fine for me.

Reported-and-Tested-by: Sergey Senozhatsky <[email protected]>


-ss

> Regards,
> Michael Wang
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index dc9b72e..a64b544 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> {
> int i;
>
> + if (dbs_data->queue_stop)
> + return;
> +
> if (!all_cpus) {
> __gov_queue_work(smp_processor_id(), dbs_data, delay);
> } else {
> - get_online_cpus();
> for_each_cpu(i, policy->cpus)
> __gov_queue_work(i, dbs_data, delay);
> - put_online_cpus();
> }
> }
> EXPORT_SYMBOL_GPL(gov_queue_work);
> @@ -193,12 +194,27 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
> struct cpufreq_policy *policy)
> {
> struct cpu_dbs_common_info *cdbs;
> - int i;
> + int i, round = 2;
>
> + dbs_data->queue_stop = 1;
> +redo:
> + round--;
> for_each_cpu(i, policy->cpus) {
> cdbs = dbs_data->cdata->get_cpu_cdbs(i);
> cancel_delayed_work_sync(&cdbs->work);
> }
> +
> + /*
> + * Since there is no lock to prevent re-queueing of the
> + * cancelled work, some early cancelled work might
> + * have been queued again by later cancelled work.
> + *
> + * Flush the work again with dbs_data->queue_stop
> + * enabled, this time there will be no survivors.
> + */
> + if (round)
> + goto redo;
> + dbs_data->queue_stop = 0;
> }
>
> /* Will return if we need to evaluate cpu load again or not */
> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
> index e16a961..9116135 100644
> --- a/drivers/cpufreq/cpufreq_governor.h
> +++ b/drivers/cpufreq/cpufreq_governor.h
> @@ -213,6 +213,7 @@ struct dbs_data {
> unsigned int min_sampling_rate;
> int usage_count;
> void *tuners;
> + int queue_stop;
>
> /* dbs_mutex protects dbs_enable in governor start/stop */
> struct mutex mutex;
>
> >
> > Signed-off-by: Sergey Senozhatsky <[email protected]>
> >
> > ---
> >
> > drivers/cpufreq/cpufreq.c | 5 +----
> > drivers/cpufreq/cpufreq_governor.c | 17 +++++++++++------
> > drivers/cpufreq/cpufreq_stats.c | 2 +-
> > 3 files changed, 13 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 6a015ad..f8aacf1 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -1943,13 +1943,10 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
> > case CPU_ONLINE:
> > cpufreq_add_dev(dev, NULL);
> > break;
> > - case CPU_DOWN_PREPARE:
> > + case CPU_POST_DEAD:
> > case CPU_UP_CANCELED_FROZEN:
> > __cpufreq_remove_dev(dev, NULL);
> > break;
> > - case CPU_DOWN_FAILED:
> > - cpufreq_add_dev(dev, NULL);
> > - break;
> > }
> > }
> > return NOTIFY_OK;
> > diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> > index 4645876..681d5d6 100644
> > --- a/drivers/cpufreq/cpufreq_governor.c
> > +++ b/drivers/cpufreq/cpufreq_governor.c
> > @@ -125,7 +125,11 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
> > unsigned int delay)
> > {
> > struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
> > -
> > + /* cpu offline might block existing gov_queue_work() user,
> > + * unblocking it after CPU_DEAD and before CPU_POST_DEAD.
> > + * thus potentially we can hit offlined CPU */
> > + if (unlikely(cpu_is_offline(cpu)))
> > + return;
> > mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
> > }
> >
> > @@ -133,15 +137,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> > unsigned int delay, bool all_cpus)
> > {
> > int i;
> > -
> > + get_online_cpus();
> > if (!all_cpus) {
> > __gov_queue_work(smp_processor_id(), dbs_data, delay);
> > } else {
> > - get_online_cpus();
> > for_each_cpu(i, policy->cpus)
> > __gov_queue_work(i, dbs_data, delay);
> > - put_online_cpus();
> > }
> > + put_online_cpus();
> > }
> > EXPORT_SYMBOL_GPL(gov_queue_work);
> >
> > @@ -354,8 +357,10 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
> > /* Initiate timer time stamp */
> > cpu_cdbs->time_stamp = ktime_get();
> >
> > - gov_queue_work(dbs_data, policy,
> > - delay_for_sampling_rate(sampling_rate), true);
> > + /* hotplug lock already held */
> > + for_each_cpu(j, policy->cpus)
> > + __gov_queue_work(j, dbs_data,
> > + delay_for_sampling_rate(sampling_rate));
> > break;
> >
> > case CPUFREQ_GOV_STOP:
> > diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
> > index cd9e817..833816e 100644
> > --- a/drivers/cpufreq/cpufreq_stats.c
> > +++ b/drivers/cpufreq/cpufreq_stats.c
> > @@ -355,7 +355,7 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
> > case CPU_DOWN_PREPARE:
> > cpufreq_stats_free_sysfs(cpu);
> > break;
> > - case CPU_DEAD:
> > + case CPU_POST_DEAD:
> > cpufreq_stats_free_table(cpu);
> > break;
> > case CPU_UP_CANCELED_FROZEN:
>

2013-07-11 08:47:44

by Michael wang

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On 07/11/2013 04:22 PM, Sergey Senozhatsky wrote:
[snip]
>>
>
> Hello Michael,
> nice job! works fine for me.
>
> Reported-and-Tested-by: Sergey Senozhatsky <[email protected]>

Thanks for the test :)

Borislav may also be doing some testing; let's wait for a few days and see
whether there are any points we missed.

And we should also thank Srivatsa for catching the root issue ;-)

Regards,
Michael Wang

>
>
> -ss
>
>> Regards,
>> Michael Wang
>>
>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>> index dc9b72e..a64b544 100644
>> --- a/drivers/cpufreq/cpufreq_governor.c
>> +++ b/drivers/cpufreq/cpufreq_governor.c
>> @@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>> {
>> int i;
>>
>> + if (dbs_data->queue_stop)
>> + return;
>> +
>> if (!all_cpus) {
>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
>> } else {
>> - get_online_cpus();
>> for_each_cpu(i, policy->cpus)
>> __gov_queue_work(i, dbs_data, delay);
>> - put_online_cpus();
>> }
>> }
>> EXPORT_SYMBOL_GPL(gov_queue_work);
>> @@ -193,12 +194,27 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
>> struct cpufreq_policy *policy)
>> {
>> struct cpu_dbs_common_info *cdbs;
>> - int i;
>> + int i, round = 2;
>>
>> + dbs_data->queue_stop = 1;
>> +redo:
>> + round--;
>> for_each_cpu(i, policy->cpus) {
>> cdbs = dbs_data->cdata->get_cpu_cdbs(i);
>> cancel_delayed_work_sync(&cdbs->work);
>> }
>> +
>> + /*
>> + * Since there is no lock to prevent re-queueing of the
>> + * cancelled work, some early cancelled work might
>> + * have been queued again by later cancelled work.
>> + *
>> + * Flush the work again with dbs_data->queue_stop
>> + * enabled, this time there will be no survivors.
>> + */
>> + if (round)
>> + goto redo;
>> + dbs_data->queue_stop = 0;
>> }
>>
>> /* Will return if we need to evaluate cpu load again or not */
>> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
>> index e16a961..9116135 100644
>> --- a/drivers/cpufreq/cpufreq_governor.h
>> +++ b/drivers/cpufreq/cpufreq_governor.h
>> @@ -213,6 +213,7 @@ struct dbs_data {
>> unsigned int min_sampling_rate;
>> int usage_count;
>> void *tuners;
>> + int queue_stop;
>>
>> /* dbs_mutex protects dbs_enable in governor start/stop */
>> struct mutex mutex;
>>
>>>
>>> Signed-off-by: Sergey Senozhatsky <[email protected]>
>>>
>>> ---
>>>
>>> drivers/cpufreq/cpufreq.c | 5 +----
>>> drivers/cpufreq/cpufreq_governor.c | 17 +++++++++++------
>>> drivers/cpufreq/cpufreq_stats.c | 2 +-
>>> 3 files changed, 13 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>> index 6a015ad..f8aacf1 100644
>>> --- a/drivers/cpufreq/cpufreq.c
>>> +++ b/drivers/cpufreq/cpufreq.c
>>> @@ -1943,13 +1943,10 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
>>> case CPU_ONLINE:
>>> cpufreq_add_dev(dev, NULL);
>>> break;
>>> - case CPU_DOWN_PREPARE:
>>> + case CPU_POST_DEAD:
>>> case CPU_UP_CANCELED_FROZEN:
>>> __cpufreq_remove_dev(dev, NULL);
>>> break;
>>> - case CPU_DOWN_FAILED:
>>> - cpufreq_add_dev(dev, NULL);
>>> - break;
>>> }
>>> }
>>> return NOTIFY_OK;
>>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>>> index 4645876..681d5d6 100644
>>> --- a/drivers/cpufreq/cpufreq_governor.c
>>> +++ b/drivers/cpufreq/cpufreq_governor.c
>>> @@ -125,7 +125,11 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
>>> unsigned int delay)
>>> {
>>> struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
>>> -
>>> + /* cpu offline might block existing gov_queue_work() user,
>>> + * unblocking it after CPU_DEAD and before CPU_POST_DEAD.
>>> + * thus potentially we can hit offlined CPU */
>>> + if (unlikely(cpu_is_offline(cpu)))
>>> + return;
>>> mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
>>> }
>>>
>>> @@ -133,15 +137,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>>> unsigned int delay, bool all_cpus)
>>> {
>>> int i;
>>> -
>>> + get_online_cpus();
>>> if (!all_cpus) {
>>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
>>> } else {
>>> - get_online_cpus();
>>> for_each_cpu(i, policy->cpus)
>>> __gov_queue_work(i, dbs_data, delay);
>>> - put_online_cpus();
>>> }
>>> + put_online_cpus();
>>> }
>>> EXPORT_SYMBOL_GPL(gov_queue_work);
>>>
>>> @@ -354,8 +357,10 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>>> /* Initiate timer time stamp */
>>> cpu_cdbs->time_stamp = ktime_get();
>>>
>>> - gov_queue_work(dbs_data, policy,
>>> - delay_for_sampling_rate(sampling_rate), true);
>>> + /* hotplug lock already held */
>>> + for_each_cpu(j, policy->cpus)
>>> + __gov_queue_work(j, dbs_data,
>>> + delay_for_sampling_rate(sampling_rate));
>>> break;
>>>
>>> case CPUFREQ_GOV_STOP:
>>> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
>>> index cd9e817..833816e 100644
>>> --- a/drivers/cpufreq/cpufreq_stats.c
>>> +++ b/drivers/cpufreq/cpufreq_stats.c
>>> @@ -355,7 +355,7 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
>>> case CPU_DOWN_PREPARE:
>>> cpufreq_stats_free_sysfs(cpu);
>>> break;
>>> - case CPU_DEAD:
>>> + case CPU_POST_DEAD:
>>> cpufreq_stats_free_table(cpu);
>>> break;
>>> case CPU_UP_CANCELED_FROZEN:
>>

2013-07-11 08:49:12

by Michael wang

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On 07/11/2013 04:47 PM, Michael Wang wrote:
> On 07/11/2013 04:22 PM, Sergey Senozhatsky wrote:
> [snip]
>>>
>>
>> Hello Michael,
>> nice job! works fine for me.
>>
>> Reported-and-Tested-by: Sergey Senozhatsky <[email protected]>
>
> Thanks for the test :)
>
> Borislav may also be doing some testing; let's wait for a few days and see
> whether there are any points we missed.

s /Borislav/Bartlomiej

>
> And we should also thank Srivatsa for catching the root issue ;-)
>
> Regards,
> Michael Wang
>
>>
>>
>> -ss
>>
>>> Regards,
>>> Michael Wang
>>>
>>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>>> index dc9b72e..a64b544 100644
>>> --- a/drivers/cpufreq/cpufreq_governor.c
>>> +++ b/drivers/cpufreq/cpufreq_governor.c
>>> @@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>>> {
>>> int i;
>>>
>>> + if (dbs_data->queue_stop)
>>> + return;
>>> +
>>> if (!all_cpus) {
>>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
>>> } else {
>>> - get_online_cpus();
>>> for_each_cpu(i, policy->cpus)
>>> __gov_queue_work(i, dbs_data, delay);
>>> - put_online_cpus();
>>> }
>>> }
>>> EXPORT_SYMBOL_GPL(gov_queue_work);
>>> @@ -193,12 +194,27 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
>>> struct cpufreq_policy *policy)
>>> {
>>> struct cpu_dbs_common_info *cdbs;
>>> - int i;
>>> + int i, round = 2;
>>>
>>> + dbs_data->queue_stop = 1;
>>> +redo:
>>> + round--;
>>> for_each_cpu(i, policy->cpus) {
>>> cdbs = dbs_data->cdata->get_cpu_cdbs(i);
>>> cancel_delayed_work_sync(&cdbs->work);
>>> }
>>> +
>>> + /*
>>> + * Since there is no lock to prevent re-queueing of the
>>> + * cancelled work, some early cancelled work might
>>> + * have been queued again by later cancelled work.
>>> + *
>>> + * Flush the work again with dbs_data->queue_stop
>>> + * enabled, this time there will be no survivors.
>>> + */
>>> + if (round)
>>> + goto redo;
>>> + dbs_data->queue_stop = 0;
>>> }
>>>
>>> /* Will return if we need to evaluate cpu load again or not */
>>> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
>>> index e16a961..9116135 100644
>>> --- a/drivers/cpufreq/cpufreq_governor.h
>>> +++ b/drivers/cpufreq/cpufreq_governor.h
>>> @@ -213,6 +213,7 @@ struct dbs_data {
>>> unsigned int min_sampling_rate;
>>> int usage_count;
>>> void *tuners;
>>> + int queue_stop;
>>>
>>> /* dbs_mutex protects dbs_enable in governor start/stop */
>>> struct mutex mutex;
>>>
>>>>
>>>> Signed-off-by: Sergey Senozhatsky <[email protected]>
>>>>
>>>> ---
>>>>
>>>> drivers/cpufreq/cpufreq.c | 5 +----
>>>> drivers/cpufreq/cpufreq_governor.c | 17 +++++++++++------
>>>> drivers/cpufreq/cpufreq_stats.c | 2 +-
>>>> 3 files changed, 13 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>>> index 6a015ad..f8aacf1 100644
>>>> --- a/drivers/cpufreq/cpufreq.c
>>>> +++ b/drivers/cpufreq/cpufreq.c
>>>> @@ -1943,13 +1943,10 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
>>>> case CPU_ONLINE:
>>>> cpufreq_add_dev(dev, NULL);
>>>> break;
>>>> - case CPU_DOWN_PREPARE:
>>>> + case CPU_POST_DEAD:
>>>> case CPU_UP_CANCELED_FROZEN:
>>>> __cpufreq_remove_dev(dev, NULL);
>>>> break;
>>>> - case CPU_DOWN_FAILED:
>>>> - cpufreq_add_dev(dev, NULL);
>>>> - break;
>>>> }
>>>> }
>>>> return NOTIFY_OK;
>>>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>>>> index 4645876..681d5d6 100644
>>>> --- a/drivers/cpufreq/cpufreq_governor.c
>>>> +++ b/drivers/cpufreq/cpufreq_governor.c
>>>> @@ -125,7 +125,11 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
>>>> unsigned int delay)
>>>> {
>>>> struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
>>>> -
>>>> + /* cpu offline might block existing gov_queue_work() user,
>>>> + * unblocking it after CPU_DEAD and before CPU_POST_DEAD.
>>>> + * thus potentially we can hit offlined CPU */
>>>> + if (unlikely(cpu_is_offline(cpu)))
>>>> + return;
>>>> mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
>>>> }
>>>>
>>>> @@ -133,15 +137,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>>>> unsigned int delay, bool all_cpus)
>>>> {
>>>> int i;
>>>> -
>>>> + get_online_cpus();
>>>> if (!all_cpus) {
>>>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
>>>> } else {
>>>> - get_online_cpus();
>>>> for_each_cpu(i, policy->cpus)
>>>> __gov_queue_work(i, dbs_data, delay);
>>>> - put_online_cpus();
>>>> }
>>>> + put_online_cpus();
>>>> }
>>>> EXPORT_SYMBOL_GPL(gov_queue_work);
>>>>
>>>> @@ -354,8 +357,10 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>>>> /* Initiate timer time stamp */
>>>> cpu_cdbs->time_stamp = ktime_get();
>>>>
>>>> - gov_queue_work(dbs_data, policy,
>>>> - delay_for_sampling_rate(sampling_rate), true);
>>>> + /* hotplug lock already held */
>>>> + for_each_cpu(j, policy->cpus)
>>>> + __gov_queue_work(j, dbs_data,
>>>> + delay_for_sampling_rate(sampling_rate));
>>>> break;
>>>>
>>>> case CPUFREQ_GOV_STOP:
>>>> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
>>>> index cd9e817..833816e 100644
>>>> --- a/drivers/cpufreq/cpufreq_stats.c
>>>> +++ b/drivers/cpufreq/cpufreq_stats.c
>>>> @@ -355,7 +355,7 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
>>>> case CPU_DOWN_PREPARE:
>>>> cpufreq_stats_free_sysfs(cpu);
>>>> break;
>>>> - case CPU_DEAD:
>>>> + case CPU_POST_DEAD:
>>>> cpufreq_stats_free_table(cpu);
>>>> break;
>>>> case CPU_UP_CANCELED_FROZEN:
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>>>> the body of a message to [email protected]
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>> Please read the FAQ at http://www.tux.org/lkml/
>>>>
>>>
>>
>

2013-07-11 09:01:45

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On (07/11/13 16:47), Michael Wang wrote:
> On 07/11/2013 04:22 PM, Sergey Senozhatsky wrote:
> [snip]
> >>
> >
> > Hello Michael,
> > nice job! works fine for me.
> >
> > Reported-and-Tested-by: Sergey Senozhatsky <[email protected]>
>
> Thanks for the test :)
>
> Borislav may also be doing some testing; let's wait a few days and see
> whether there are any points we missed.
>
> And we should also thank Srivatsa for catching the root issue ;-)

Sure, many thanks to everyone.
I'll perform some additional testing.

-ss

> Regards,
> Michael Wang
>
> >
> >
> > -ss
> >
> >> Regards,
> >> Michael Wang
> >>
> >> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> >> index dc9b72e..a64b544 100644
> >> --- a/drivers/cpufreq/cpufreq_governor.c
> >> +++ b/drivers/cpufreq/cpufreq_governor.c
> >> @@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> >> {
> >> int i;
> >>
> >> + if (dbs_data->queue_stop)
> >> + return;
> >> +
> >> if (!all_cpus) {
> >> __gov_queue_work(smp_processor_id(), dbs_data, delay);
> >> } else {
> >> - get_online_cpus();
> >> for_each_cpu(i, policy->cpus)
> >> __gov_queue_work(i, dbs_data, delay);
> >> - put_online_cpus();
> >> }
> >> }
> >> EXPORT_SYMBOL_GPL(gov_queue_work);
> >> @@ -193,12 +194,27 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
> >> struct cpufreq_policy *policy)
> >> {
> >> struct cpu_dbs_common_info *cdbs;
> >> - int i;
> >> + int i, round = 2;
> >>
> >> + dbs_data->queue_stop = 1;
> >> +redo:
> >> + round--;
> >> for_each_cpu(i, policy->cpus) {
> >> cdbs = dbs_data->cdata->get_cpu_cdbs(i);
> >> cancel_delayed_work_sync(&cdbs->work);
> >> }
> >> +
> >> + /*
> >> + * Since there is no lock to prevent re-queueing the
> >> + * cancelled work, some early cancelled work might
> >> + * have been queued again by later cancelled work.
> >> + *
> >> + * Flush the work again with dbs_data->queue_stop
> >> + * enabled, this time there will be no survivors.
> >> + */
> >> + if (round)
> >> + goto redo;
> >> + dbs_data->queue_stop = 0;
> >> }
> >>
> >> /* Will return if we need to evaluate cpu load again or not */
> >> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
> >> index e16a961..9116135 100644
> >> --- a/drivers/cpufreq/cpufreq_governor.h
> >> +++ b/drivers/cpufreq/cpufreq_governor.h
> >> @@ -213,6 +213,7 @@ struct dbs_data {
> >> unsigned int min_sampling_rate;
> >> int usage_count;
> >> void *tuners;
> >> + int queue_stop;
> >>
> >> /* dbs_mutex protects dbs_enable in governor start/stop */
> >> struct mutex mutex;
> >>
> >>>
> >>> Signed-off-by: Sergey Senozhatsky <[email protected]>
> >>>
> >>> ---
> >>>
> >>> drivers/cpufreq/cpufreq.c | 5 +----
> >>> drivers/cpufreq/cpufreq_governor.c | 17 +++++++++++------
> >>> drivers/cpufreq/cpufreq_stats.c | 2 +-
> >>> 3 files changed, 13 insertions(+), 11 deletions(-)
> >>>
> >>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> >>> index 6a015ad..f8aacf1 100644
> >>> --- a/drivers/cpufreq/cpufreq.c
> >>> +++ b/drivers/cpufreq/cpufreq.c
> >>> @@ -1943,13 +1943,10 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
> >>> case CPU_ONLINE:
> >>> cpufreq_add_dev(dev, NULL);
> >>> break;
> >>> - case CPU_DOWN_PREPARE:
> >>> + case CPU_POST_DEAD:
> >>> case CPU_UP_CANCELED_FROZEN:
> >>> __cpufreq_remove_dev(dev, NULL);
> >>> break;
> >>> - case CPU_DOWN_FAILED:
> >>> - cpufreq_add_dev(dev, NULL);
> >>> - break;
> >>> }
> >>> }
> >>> return NOTIFY_OK;
> >>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> >>> index 4645876..681d5d6 100644
> >>> --- a/drivers/cpufreq/cpufreq_governor.c
> >>> +++ b/drivers/cpufreq/cpufreq_governor.c
> >>> @@ -125,7 +125,11 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
> >>> unsigned int delay)
> >>> {
> >>> struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
> >>> -
> >>> + /* cpu offline might block existing gov_queue_work() user,
> >>> + * unblocking it after CPU_DEAD and before CPU_POST_DEAD.
> >>> + * thus potentially we can hit offlined CPU */
> >>> + if (unlikely(cpu_is_offline(cpu)))
> >>> + return;
> >>> mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
> >>> }
> >>>
> >>> @@ -133,15 +137,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> >>> unsigned int delay, bool all_cpus)
> >>> {
> >>> int i;
> >>> -
> >>> + get_online_cpus();
> >>> if (!all_cpus) {
> >>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
> >>> } else {
> >>> - get_online_cpus();
> >>> for_each_cpu(i, policy->cpus)
> >>> __gov_queue_work(i, dbs_data, delay);
> >>> - put_online_cpus();
> >>> }
> >>> + put_online_cpus();
> >>> }
> >>> EXPORT_SYMBOL_GPL(gov_queue_work);
> >>>
> >>> @@ -354,8 +357,10 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
> >>> /* Initiate timer time stamp */
> >>> cpu_cdbs->time_stamp = ktime_get();
> >>>
> >>> - gov_queue_work(dbs_data, policy,
> >>> - delay_for_sampling_rate(sampling_rate), true);
> >>> + /* hotplug lock already held */
> >>> + for_each_cpu(j, policy->cpus)
> >>> + __gov_queue_work(j, dbs_data,
> >>> + delay_for_sampling_rate(sampling_rate));
> >>> break;
> >>>
> >>> case CPUFREQ_GOV_STOP:
> >>> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
> >>> index cd9e817..833816e 100644
> >>> --- a/drivers/cpufreq/cpufreq_stats.c
> >>> +++ b/drivers/cpufreq/cpufreq_stats.c
> >>> @@ -355,7 +355,7 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
> >>> case CPU_DOWN_PREPARE:
> >>> cpufreq_stats_free_sysfs(cpu);
> >>> break;
> >>> - case CPU_DEAD:
> >>> + case CPU_POST_DEAD:
> >>> cpufreq_stats_free_table(cpu);
> >>> break;
> >>> case CPU_UP_CANCELED_FROZEN:
> >>> --
> >>>
> >>
> >
>

Subject: Re: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected


Hi,

On Thursday, July 11, 2013 04:48:51 PM Michael Wang wrote:
> On 07/11/2013 04:47 PM, Michael Wang wrote:
> > On 07/11/2013 04:22 PM, Sergey Senozhatsky wrote:
> > [snip]
> >>>
> >>
> >> Hello Michael,
> >> nice job! works fine for me.
> >>
> >> Reported-and-Tested-by: Sergey Senozhatsky <[email protected]>
> >
> > Thanks for the test :)
> >
> > Borislav may also be doing some testing; let's wait a few days and see
> > whether there are any points we missed.
>
> s /Borislav/Bartlomiej

Michael's patch also works for me. Thanks to everyone involved!
(My only nitpick for the patch is that ->queue_stop can be made bool.)

Reported-and-Tested-by: Bartlomiej Zolnierkiewicz <[email protected]>

I think that it would also be helpful if Jiri or Borislav could test
the patch and see if it really works for them and fixes the original
warning they were experiencing on x86.

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics

> > And we should also thank Srivatsa for catching the root issue ;-)
> >
> > Regards,
> > Michael Wang
> >
> >>
> >>
> >> -ss
> >>
> >>> Regards,
> >>> Michael Wang
> >>>
> >>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> >>> index dc9b72e..a64b544 100644
> >>> --- a/drivers/cpufreq/cpufreq_governor.c
> >>> +++ b/drivers/cpufreq/cpufreq_governor.c
> >>> @@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> >>> {
> >>> int i;
> >>>
> >>> + if (dbs_data->queue_stop)
> >>> + return;
> >>> +
> >>> if (!all_cpus) {
> >>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
> >>> } else {
> >>> - get_online_cpus();
> >>> for_each_cpu(i, policy->cpus)
> >>> __gov_queue_work(i, dbs_data, delay);
> >>> - put_online_cpus();
> >>> }
> >>> }
> >>> EXPORT_SYMBOL_GPL(gov_queue_work);
> >>> @@ -193,12 +194,27 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
> >>> struct cpufreq_policy *policy)
> >>> {
> >>> struct cpu_dbs_common_info *cdbs;
> >>> - int i;
> >>> + int i, round = 2;
> >>>
> >>> + dbs_data->queue_stop = 1;
> >>> +redo:
> >>> + round--;
> >>> for_each_cpu(i, policy->cpus) {
> >>> cdbs = dbs_data->cdata->get_cpu_cdbs(i);
> >>> cancel_delayed_work_sync(&cdbs->work);
> >>> }
> >>> +
> >>> + /*
> >>> + * Since there is no lock to prevent re-queueing the
> >>> + * cancelled work, some early cancelled work might
> >>> + * have been queued again by later cancelled work.
> >>> + *
> >>> + * Flush the work again with dbs_data->queue_stop
> >>> + * enabled, this time there will be no survivors.
> >>> + */
> >>> + if (round)
> >>> + goto redo;
> >>> + dbs_data->queue_stop = 0;
> >>> }
> >>>
> >>> /* Will return if we need to evaluate cpu load again or not */
> >>> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
> >>> index e16a961..9116135 100644
> >>> --- a/drivers/cpufreq/cpufreq_governor.h
> >>> +++ b/drivers/cpufreq/cpufreq_governor.h
> >>> @@ -213,6 +213,7 @@ struct dbs_data {
> >>> unsigned int min_sampling_rate;
> >>> int usage_count;
> >>> void *tuners;
> >>> + int queue_stop;
> >>>
> >>> /* dbs_mutex protects dbs_enable in governor start/stop */
> >>> struct mutex mutex;
> >>>
> >>>>
> >>>> Signed-off-by: Sergey Senozhatsky <[email protected]>
> >>>>
> >>>> ---
> >>>>
> >>>> drivers/cpufreq/cpufreq.c | 5 +----
> >>>> drivers/cpufreq/cpufreq_governor.c | 17 +++++++++++------
> >>>> drivers/cpufreq/cpufreq_stats.c | 2 +-
> >>>> 3 files changed, 13 insertions(+), 11 deletions(-)
> >>>>
> >>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> >>>> index 6a015ad..f8aacf1 100644
> >>>> --- a/drivers/cpufreq/cpufreq.c
> >>>> +++ b/drivers/cpufreq/cpufreq.c
> >>>> @@ -1943,13 +1943,10 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
> >>>> case CPU_ONLINE:
> >>>> cpufreq_add_dev(dev, NULL);
> >>>> break;
> >>>> - case CPU_DOWN_PREPARE:
> >>>> + case CPU_POST_DEAD:
> >>>> case CPU_UP_CANCELED_FROZEN:
> >>>> __cpufreq_remove_dev(dev, NULL);
> >>>> break;
> >>>> - case CPU_DOWN_FAILED:
> >>>> - cpufreq_add_dev(dev, NULL);
> >>>> - break;
> >>>> }
> >>>> }
> >>>> return NOTIFY_OK;
> >>>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> >>>> index 4645876..681d5d6 100644
> >>>> --- a/drivers/cpufreq/cpufreq_governor.c
> >>>> +++ b/drivers/cpufreq/cpufreq_governor.c
> >>>> @@ -125,7 +125,11 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
> >>>> unsigned int delay)
> >>>> {
> >>>> struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
> >>>> -
> >>>> + /* cpu offline might block existing gov_queue_work() user,
> >>>> + * unblocking it after CPU_DEAD and before CPU_POST_DEAD.
> >>>> + * thus potentially we can hit offlined CPU */
> >>>> + if (unlikely(cpu_is_offline(cpu)))
> >>>> + return;
> >>>> mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
> >>>> }
> >>>>
> >>>> @@ -133,15 +137,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> >>>> unsigned int delay, bool all_cpus)
> >>>> {
> >>>> int i;
> >>>> -
> >>>> + get_online_cpus();
> >>>> if (!all_cpus) {
> >>>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
> >>>> } else {
> >>>> - get_online_cpus();
> >>>> for_each_cpu(i, policy->cpus)
> >>>> __gov_queue_work(i, dbs_data, delay);
> >>>> - put_online_cpus();
> >>>> }
> >>>> + put_online_cpus();
> >>>> }
> >>>> EXPORT_SYMBOL_GPL(gov_queue_work);
> >>>>
> >>>> @@ -354,8 +357,10 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
> >>>> /* Initiate timer time stamp */
> >>>> cpu_cdbs->time_stamp = ktime_get();
> >>>>
> >>>> - gov_queue_work(dbs_data, policy,
> >>>> - delay_for_sampling_rate(sampling_rate), true);
> >>>> + /* hotplug lock already held */
> >>>> + for_each_cpu(j, policy->cpus)
> >>>> + __gov_queue_work(j, dbs_data,
> >>>> + delay_for_sampling_rate(sampling_rate));
> >>>> break;
> >>>>
> >>>> case CPUFREQ_GOV_STOP:
> >>>> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
> >>>> index cd9e817..833816e 100644
> >>>> --- a/drivers/cpufreq/cpufreq_stats.c
> >>>> +++ b/drivers/cpufreq/cpufreq_stats.c
> >>>> @@ -355,7 +355,7 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
> >>>> case CPU_DOWN_PREPARE:
> >>>> cpufreq_stats_free_sysfs(cpu);
> >>>> break;
> >>>> - case CPU_DEAD:
> >>>> + case CPU_POST_DEAD:
> >>>> cpufreq_stats_free_table(cpu);
> >>>> break;
> >>>> case CPU_UP_CANCELED_FROZEN:
> >>>> --

2013-07-12 02:20:11

by Michael wang

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On 07/11/2013 07:47 PM, Bartlomiej Zolnierkiewicz wrote:
[snip]
>
> Michael's patch also works for me. Thanks to everyone involved!
> (My only nitpick for the patch is that ->queue_stop can be made bool.)
>
> Reported-and-Tested-by: Bartlomiej Zolnierkiewicz <[email protected]>
>
> I think that it would also be helpful if Jiri or Borislav could test
> the patch and see if it really works for them and fixes the original
> warning they were experiencing on x86.

Thanks for the testing :)

I plan to send out the formal patch next week, so Jiri and Borislav
will have a chance to join the discussion.

Regards,
Michael Wang

>
> Best regards,
> --
> Bartlomiej Zolnierkiewicz
> Samsung R&D Institute Poland
> Samsung Electronics
>
>>> And we should also thank Srivatsa for catching the root issue ;-)
>>>
>>> Regards,
>>> Michael Wang
>>>
>>>>
>>>>
>>>> -ss
>>>>
>>>>> Regards,
>>>>> Michael Wang
>>>>>
>>>>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>>>>> index dc9b72e..a64b544 100644
>>>>> --- a/drivers/cpufreq/cpufreq_governor.c
>>>>> +++ b/drivers/cpufreq/cpufreq_governor.c
>>>>> @@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>>>>> {
>>>>> int i;
>>>>>
>>>>> + if (dbs_data->queue_stop)
>>>>> + return;
>>>>> +
>>>>> if (!all_cpus) {
>>>>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
>>>>> } else {
>>>>> - get_online_cpus();
>>>>> for_each_cpu(i, policy->cpus)
>>>>> __gov_queue_work(i, dbs_data, delay);
>>>>> - put_online_cpus();
>>>>> }
>>>>> }
>>>>> EXPORT_SYMBOL_GPL(gov_queue_work);
>>>>> @@ -193,12 +194,27 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
>>>>> struct cpufreq_policy *policy)
>>>>> {
>>>>> struct cpu_dbs_common_info *cdbs;
>>>>> - int i;
>>>>> + int i, round = 2;
>>>>>
>>>>> + dbs_data->queue_stop = 1;
>>>>> +redo:
>>>>> + round--;
>>>>> for_each_cpu(i, policy->cpus) {
>>>>> cdbs = dbs_data->cdata->get_cpu_cdbs(i);
>>>>> cancel_delayed_work_sync(&cdbs->work);
>>>>> }
>>>>> +
>>>>> + /*
>>>>> + * Since there is no lock to prevent re-queueing the
>>>>> + * cancelled work, some early cancelled work might
>>>>> + * have been queued again by later cancelled work.
>>>>> + *
>>>>> + * Flush the work again with dbs_data->queue_stop
>>>>> + * enabled, this time there will be no survivors.
>>>>> + */
>>>>> + if (round)
>>>>> + goto redo;
>>>>> + dbs_data->queue_stop = 0;
>>>>> }
>>>>>
>>>>> /* Will return if we need to evaluate cpu load again or not */
>>>>> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
>>>>> index e16a961..9116135 100644
>>>>> --- a/drivers/cpufreq/cpufreq_governor.h
>>>>> +++ b/drivers/cpufreq/cpufreq_governor.h
>>>>> @@ -213,6 +213,7 @@ struct dbs_data {
>>>>> unsigned int min_sampling_rate;
>>>>> int usage_count;
>>>>> void *tuners;
>>>>> + int queue_stop;
>>>>>
>>>>> /* dbs_mutex protects dbs_enable in governor start/stop */
>>>>> struct mutex mutex;
>>>>>
>>>>>>
>>>>>> Signed-off-by: Sergey Senozhatsky <[email protected]>
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> drivers/cpufreq/cpufreq.c | 5 +----
>>>>>> drivers/cpufreq/cpufreq_governor.c | 17 +++++++++++------
>>>>>> drivers/cpufreq/cpufreq_stats.c | 2 +-
>>>>>> 3 files changed, 13 insertions(+), 11 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>>>>> index 6a015ad..f8aacf1 100644
>>>>>> --- a/drivers/cpufreq/cpufreq.c
>>>>>> +++ b/drivers/cpufreq/cpufreq.c
>>>>>> @@ -1943,13 +1943,10 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
>>>>>> case CPU_ONLINE:
>>>>>> cpufreq_add_dev(dev, NULL);
>>>>>> break;
>>>>>> - case CPU_DOWN_PREPARE:
>>>>>> + case CPU_POST_DEAD:
>>>>>> case CPU_UP_CANCELED_FROZEN:
>>>>>> __cpufreq_remove_dev(dev, NULL);
>>>>>> break;
>>>>>> - case CPU_DOWN_FAILED:
>>>>>> - cpufreq_add_dev(dev, NULL);
>>>>>> - break;
>>>>>> }
>>>>>> }
>>>>>> return NOTIFY_OK;
>>>>>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>>>>>> index 4645876..681d5d6 100644
>>>>>> --- a/drivers/cpufreq/cpufreq_governor.c
>>>>>> +++ b/drivers/cpufreq/cpufreq_governor.c
>>>>>> @@ -125,7 +125,11 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
>>>>>> unsigned int delay)
>>>>>> {
>>>>>> struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
>>>>>> -
>>>>>> + /* cpu offline might block existing gov_queue_work() user,
>>>>>> + * unblocking it after CPU_DEAD and before CPU_POST_DEAD.
>>>>>> + * thus potentially we can hit offlined CPU */
>>>>>> + if (unlikely(cpu_is_offline(cpu)))
>>>>>> + return;
>>>>>> mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
>>>>>> }
>>>>>>
>>>>>> @@ -133,15 +137,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>>>>>> unsigned int delay, bool all_cpus)
>>>>>> {
>>>>>> int i;
>>>>>> -
>>>>>> + get_online_cpus();
>>>>>> if (!all_cpus) {
>>>>>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
>>>>>> } else {
>>>>>> - get_online_cpus();
>>>>>> for_each_cpu(i, policy->cpus)
>>>>>> __gov_queue_work(i, dbs_data, delay);
>>>>>> - put_online_cpus();
>>>>>> }
>>>>>> + put_online_cpus();
>>>>>> }
>>>>>> EXPORT_SYMBOL_GPL(gov_queue_work);
>>>>>>
>>>>>> @@ -354,8 +357,10 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>>>>>> /* Initiate timer time stamp */
>>>>>> cpu_cdbs->time_stamp = ktime_get();
>>>>>>
>>>>>> - gov_queue_work(dbs_data, policy,
>>>>>> - delay_for_sampling_rate(sampling_rate), true);
>>>>>> + /* hotplug lock already held */
>>>>>> + for_each_cpu(j, policy->cpus)
>>>>>> + __gov_queue_work(j, dbs_data,
>>>>>> + delay_for_sampling_rate(sampling_rate));
>>>>>> break;
>>>>>>
>>>>>> case CPUFREQ_GOV_STOP:
>>>>>> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
>>>>>> index cd9e817..833816e 100644
>>>>>> --- a/drivers/cpufreq/cpufreq_stats.c
>>>>>> +++ b/drivers/cpufreq/cpufreq_stats.c
>>>>>> @@ -355,7 +355,7 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
>>>>>> case CPU_DOWN_PREPARE:
>>>>>> cpufreq_stats_free_sysfs(cpu);
>>>>>> break;
>>>>>> - case CPU_DEAD:
>>>>>> + case CPU_POST_DEAD:
>>>>>> cpufreq_stats_free_table(cpu);
>>>>>> break;
>>>>>> case CPU_UP_CANCELED_FROZEN:
>>>>>> --
>
>

2013-07-14 11:48:32

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On (07/11/13 10:43), Michael Wang wrote:
> [..]
> Nice to know you have some ideas for solving the issue ;-)
>
> I'm not sure whether I've caught the idea, but it seems like you are
> trying to re-organize the timing of adding/removing the device.
>
> I'm sure that we have more than one way to solve the issues, but what
> we need is a cure for the root cause...
>
> As Srivatsa discovered, the root issue may be that
> gov_cancel_work() fails to stop all the work after it returns.
>
> And Viresh also confirmed that this is not by design.
>
> This means that gov_queue_work() invoked by od_dbs_timer() is supposed
> to never happen after the CPUFREQ_GOV_STOP notification; the whole
> policy should stop working at that time.
>
> But it failed to, and work running concurrently with the dying CPU
> caused the first problem.
>
> Thus I think we should focus on this; I've suggested the fix below,
> and I'd like to know your opinions :)
>

Hello,

I just realized that lockdep was disabling itself at startup (after a recent
AMD Radeon patch set) due to a radeon KMS error:

[ 4.790019] [drm] Loading CEDAR Microcode
[ 4.790943] r600_cp: Failed to load firmware "radeon/CEDAR_smc.bin"
[ 4.791152] [drm:evergreen_startup] *ERROR* Failed to load firmware!
[ 4.791330] radeon 0000:01:00.0: disabling GPU acceleration
[ 4.792633] INFO: trying to register non-static key.
[ 4.792792] the code is fine but needs lockdep annotation.
[ 4.792953] turning off the locking correctness validator.


Now that I have fixed radeon KMS, I can see:

[ 806.660530] ------------[ cut here ]------------
[ 806.660539] WARNING: CPU: 0 PID: 2389 at arch/x86/kernel/smp.c:124
native_smp_send_reschedule+0x57/0x60()

[ 806.660572] Workqueue: events od_dbs_timer
[ 806.660575] 0000000000000009 ffff8801531cfbd8 ffffffff816044ee
0000000000000000
[ 806.660577] ffff8801531cfc10 ffffffff8104995d 0000000000000003
ffff8801531f8000
[ 806.660579] 000000010001ee39 0000000000000003 0000000000000003
ffff8801531cfc20
[ 806.660580] Call Trace:
[ 806.660587] [<ffffffff816044ee>] dump_stack+0x4e/0x82
[ 806.660591] [<ffffffff8104995d>] warn_slowpath_common+0x7d/0xa0
[ 806.660593] [<ffffffff81049a3a>] warn_slowpath_null+0x1a/0x20
[ 806.660595] [<ffffffff8102ca07>] native_smp_send_reschedule+0x57/0x60
[ 806.660599] [<ffffffff81085211>] wake_up_nohz_cpu+0x61/0xb0
[ 806.660603] [<ffffffff8105cb6d>] add_timer_on+0x8d/0x1e0
[ 806.660607] [<ffffffff8106cc66>] __queue_delayed_work+0x166/0x1a0
[ 806.660609] [<ffffffff8106d6a9>] ? try_to_grab_pending+0xd9/0x1a0
[ 806.660611] [<ffffffff8106d7bf>] mod_delayed_work_on+0x4f/0x90
[ 806.660613] [<ffffffff8150f436>] gov_queue_work+0x56/0xd0
[ 806.660615] [<ffffffff8150e740>] od_dbs_timer+0xc0/0x160
[ 806.660617] [<ffffffff8106dbcd>] process_one_work+0x1cd/0x6a0
[ 806.660619] [<ffffffff8106db63>] ? process_one_work+0x163/0x6a0
[ 806.660622] [<ffffffff8106e8d1>] worker_thread+0x121/0x3a0
[ 806.660627] [<ffffffff810b668d>] ? trace_hardirqs_on+0xd/0x10
[ 806.660629] [<ffffffff8106e7b0>] ? manage_workers.isra.24+0x2a0/0x2a0
[ 806.660633] [<ffffffff810760cb>] kthread+0xdb/0xe0
[ 806.660635] [<ffffffff81075ff0>] ? insert_kthread_work+0x70/0x70
[ 806.660639] [<ffffffff8160de6c>] ret_from_fork+0x7c/0xb0
[ 806.660640] [<ffffffff81075ff0>] ? insert_kthread_work+0x70/0x70
[ 806.660642] ---[ end trace 01ae278488a0ad6d ]---


This is the same problem that get/put_online_cpus() was added to
__gov_queue_work() to address:

commit 2f7021a815f20f3481c10884fe9735ce2a56db35
Author: Michael Wang

cpufreq: protect 'policy->cpus' from offlining during
__gov_queue_work()

-ss

> Regards,
> Michael Wang
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index dc9b72e..a64b544 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> {
> int i;
>
> + if (dbs_data->queue_stop)
> + return;
> +
> if (!all_cpus) {
> __gov_queue_work(smp_processor_id(), dbs_data, delay);
> } else {
> - get_online_cpus();
> for_each_cpu(i, policy->cpus)
> __gov_queue_work(i, dbs_data, delay);
> - put_online_cpus();
> }
> }
> EXPORT_SYMBOL_GPL(gov_queue_work);
> @@ -193,12 +194,27 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
> struct cpufreq_policy *policy)
> {
> struct cpu_dbs_common_info *cdbs;
> - int i;
> + int i, round = 2;
>
> + dbs_data->queue_stop = 1;
> +redo:
> + round--;
> for_each_cpu(i, policy->cpus) {
> cdbs = dbs_data->cdata->get_cpu_cdbs(i);
> cancel_delayed_work_sync(&cdbs->work);
> }
> +
> + /*
> + * Since there is no lock to prevent re-queueing the
> + * cancelled work, some early cancelled work might
> + * have been queued again by later cancelled work.
> + *
> + * Flush the work again with dbs_data->queue_stop
> + * enabled, this time there will be no survivors.
> + */
> + if (round)
> + goto redo;
> + dbs_data->queue_stop = 0;
> }
>
> /* Will return if we need to evaluate cpu load again or not */
> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
> index e16a961..9116135 100644
> --- a/drivers/cpufreq/cpufreq_governor.h
> +++ b/drivers/cpufreq/cpufreq_governor.h
> @@ -213,6 +213,7 @@ struct dbs_data {
> unsigned int min_sampling_rate;
> int usage_count;
> void *tuners;
> + int queue_stop;
>
> /* dbs_mutex protects dbs_enable in governor start/stop */
> struct mutex mutex;
>
> >
> > Signed-off-by: Sergey Senozhatsky <[email protected]>
> >
> > ---
> >
> > drivers/cpufreq/cpufreq.c | 5 +----
> > drivers/cpufreq/cpufreq_governor.c | 17 +++++++++++------
> > drivers/cpufreq/cpufreq_stats.c | 2 +-
> > 3 files changed, 13 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 6a015ad..f8aacf1 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -1943,13 +1943,10 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
> > case CPU_ONLINE:
> > cpufreq_add_dev(dev, NULL);
> > break;
> > - case CPU_DOWN_PREPARE:
> > + case CPU_POST_DEAD:
> > case CPU_UP_CANCELED_FROZEN:
> > __cpufreq_remove_dev(dev, NULL);
> > break;
> > - case CPU_DOWN_FAILED:
> > - cpufreq_add_dev(dev, NULL);
> > - break;
> > }
> > }
> > return NOTIFY_OK;
> > diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> > index 4645876..681d5d6 100644
> > --- a/drivers/cpufreq/cpufreq_governor.c
> > +++ b/drivers/cpufreq/cpufreq_governor.c
> > @@ -125,7 +125,11 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
> > unsigned int delay)
> > {
> > struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
> > -
> > + /* cpu offline might block existing gov_queue_work() user,
> > + * unblocking it after CPU_DEAD and before CPU_POST_DEAD.
> > + * thus potentially we can hit offlined CPU */
> > + if (unlikely(cpu_is_offline(cpu)))
> > + return;
> > mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
> > }
> >
> > @@ -133,15 +137,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> > unsigned int delay, bool all_cpus)
> > {
> > int i;
> > -
> > + get_online_cpus();
> > if (!all_cpus) {
> > __gov_queue_work(smp_processor_id(), dbs_data, delay);
> > } else {
> > - get_online_cpus();
> > for_each_cpu(i, policy->cpus)
> > __gov_queue_work(i, dbs_data, delay);
> > - put_online_cpus();
> > }
> > + put_online_cpus();
> > }
> > EXPORT_SYMBOL_GPL(gov_queue_work);
> >
> > @@ -354,8 +357,10 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
> > /* Initiate timer time stamp */
> > cpu_cdbs->time_stamp = ktime_get();
> >
> > - gov_queue_work(dbs_data, policy,
> > - delay_for_sampling_rate(sampling_rate), true);
> > + /* hotplug lock already held */
> > + for_each_cpu(j, policy->cpus)
> > + __gov_queue_work(j, dbs_data,
> > + delay_for_sampling_rate(sampling_rate));
> > break;
> >
> > case CPUFREQ_GOV_STOP:
> > diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
> > index cd9e817..833816e 100644
> > --- a/drivers/cpufreq/cpufreq_stats.c
> > +++ b/drivers/cpufreq/cpufreq_stats.c
> > @@ -355,7 +355,7 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
> > case CPU_DOWN_PREPARE:
> > cpufreq_stats_free_sysfs(cpu);
> > break;
> > - case CPU_DEAD:
> > + case CPU_POST_DEAD:
> > cpufreq_stats_free_table(cpu);
> > break;
> > case CPU_UP_CANCELED_FROZEN:
> > --
> >
>

2013-07-14 12:07:14

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On (07/14/13 14:47), Sergey Senozhatsky wrote:
>
> Now, as I fixed radeon kms, I can see:
>
> [ 806.660530] ------------[ cut here ]------------
> [ 806.660539] WARNING: CPU: 0 PID: 2389 at arch/x86/kernel/smp.c:124
> native_smp_send_reschedule+0x57/0x60()

Well, this one is obviously not a lockdep error; I meant that my previous
tests with lockdep disabled were invalid. Will re-do.

-ss

> > Regards,
> > Michael Wang
> >
> > diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> > index dc9b72e..a64b544 100644
> > --- a/drivers/cpufreq/cpufreq_governor.c
> > +++ b/drivers/cpufreq/cpufreq_governor.c
> > @@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> > {
> > int i;
> >
> > + if (dbs_data->queue_stop)
> > + return;
> > +
> > if (!all_cpus) {
> > __gov_queue_work(smp_processor_id(), dbs_data, delay);
> > } else {
> > - get_online_cpus();
> > for_each_cpu(i, policy->cpus)
> > __gov_queue_work(i, dbs_data, delay);
> > - put_online_cpus();
> > }
> > }
> > EXPORT_SYMBOL_GPL(gov_queue_work);
> > @@ -193,12 +194,27 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
> > struct cpufreq_policy *policy)
> > {
> > struct cpu_dbs_common_info *cdbs;
> > - int i;
> > + int i, round = 2;
> >
> > + dbs_data->queue_stop = 1;
> > +redo:
> > + round--;
> > for_each_cpu(i, policy->cpus) {
> > cdbs = dbs_data->cdata->get_cpu_cdbs(i);
> > cancel_delayed_work_sync(&cdbs->work);
> > }
> > +
> > + /*
> > + * Since there is no lock to prevent re-queueing the
> > + * cancelled work, some early cancelled work might
> > + * have been queued again by later cancelled work.
> > + *
> > + * Flush the work again with dbs_data->queue_stop
> > + * enabled, this time there will be no survivors.
> > + */
> > + if (round)
> > + goto redo;
> > + dbs_data->queue_stop = 0;
> > }
> >
> > /* Will return if we need to evaluate cpu load again or not */
> > diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
> > index e16a961..9116135 100644
> > --- a/drivers/cpufreq/cpufreq_governor.h
> > +++ b/drivers/cpufreq/cpufreq_governor.h
> > @@ -213,6 +213,7 @@ struct dbs_data {
> > unsigned int min_sampling_rate;
> > int usage_count;
> > void *tuners;
> > + int queue_stop;
> >
> > /* dbs_mutex protects dbs_enable in governor start/stop */
> > struct mutex mutex;
> >
> > >
> > > Signed-off-by: Sergey Senozhatsky <[email protected]>
> > >
> > > ---
> > >
> > > drivers/cpufreq/cpufreq.c | 5 +----
> > > drivers/cpufreq/cpufreq_governor.c | 17 +++++++++++------
> > > drivers/cpufreq/cpufreq_stats.c | 2 +-
> > > 3 files changed, 13 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > > index 6a015ad..f8aacf1 100644
> > > --- a/drivers/cpufreq/cpufreq.c
> > > +++ b/drivers/cpufreq/cpufreq.c
> > > @@ -1943,13 +1943,10 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
> > > case CPU_ONLINE:
> > > cpufreq_add_dev(dev, NULL);
> > > break;
> > > - case CPU_DOWN_PREPARE:
> > > + case CPU_POST_DEAD:
> > > case CPU_UP_CANCELED_FROZEN:
> > > __cpufreq_remove_dev(dev, NULL);
> > > break;
> > > - case CPU_DOWN_FAILED:
> > > - cpufreq_add_dev(dev, NULL);
> > > - break;
> > > }
> > > }
> > > return NOTIFY_OK;
> > > diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> > > index 4645876..681d5d6 100644
> > > --- a/drivers/cpufreq/cpufreq_governor.c
> > > +++ b/drivers/cpufreq/cpufreq_governor.c
> > > @@ -125,7 +125,11 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
> > > unsigned int delay)
> > > {
> > > struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
> > > -
> > > + /* cpu offline might block existing gov_queue_work() user,
> > > + * unblocking it after CPU_DEAD and before CPU_POST_DEAD.
> > > + * thus potentially we can hit offlined CPU */
> > > + if (unlikely(cpu_is_offline(cpu)))
> > > + return;
> > > mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
> > > }
> > >
> > > @@ -133,15 +137,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> > > unsigned int delay, bool all_cpus)
> > > {
> > > int i;
> > > -
> > > + get_online_cpus();
> > > if (!all_cpus) {
> > > __gov_queue_work(smp_processor_id(), dbs_data, delay);
> > > } else {
> > > - get_online_cpus();
> > > for_each_cpu(i, policy->cpus)
> > > __gov_queue_work(i, dbs_data, delay);
> > > - put_online_cpus();
> > > }
> > > + put_online_cpus();
> > > }
> > > EXPORT_SYMBOL_GPL(gov_queue_work);
> > >
> > > @@ -354,8 +357,10 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
> > > /* Initiate timer time stamp */
> > > cpu_cdbs->time_stamp = ktime_get();
> > >
> > > - gov_queue_work(dbs_data, policy,
> > > - delay_for_sampling_rate(sampling_rate), true);
> > > + /* hotplug lock already held */
> > > + for_each_cpu(j, policy->cpus)
> > > + __gov_queue_work(j, dbs_data,
> > > + delay_for_sampling_rate(sampling_rate));
> > > break;
> > >
> > > case CPUFREQ_GOV_STOP:
> > > diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
> > > index cd9e817..833816e 100644
> > > --- a/drivers/cpufreq/cpufreq_stats.c
> > > +++ b/drivers/cpufreq/cpufreq_stats.c
> > > @@ -355,7 +355,7 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
> > > case CPU_DOWN_PREPARE:
> > > cpufreq_stats_free_sysfs(cpu);
> > > break;
> > > - case CPU_DEAD:
> > > + case CPU_POST_DEAD:
> > > cpufreq_stats_free_table(cpu);
> > > break;
> > > case CPU_UP_CANCELED_FROZEN:
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > > the body of a message to [email protected]
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > Please read the FAQ at http://www.tux.org/lkml/
> > >
> >

2013-07-14 15:46:56

by Rafael J. Wysocki

Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On Thursday, July 11, 2013 10:43:45 AM Michael Wang wrote:
> Hi, Sergey
>
> On 07/11/2013 07:13 AM, Sergey Senozhatsky wrote:
> [snip]
> >
> >
> > Please kindly review the following patch.
> >
> >
> >
> > Remove the cpu device only upon successful cpu down, on the CPU_POST_DEAD
> > event, so we can kill off the CPU_DOWN_FAILED case and eliminate the
> > potential extra remove/add path:
> >
> > hotplug lock
> > CPU_DOWN_PREPARE: __cpufreq_remove_dev
> > CPU_DOWN_FAILED: cpufreq_add_dev
> > hotplug unlock
> >
> > Since the cpu is still present on the CPU_DEAD event, the cpu stats
> > table should be kept longer and removed later, on CPU_POST_DEAD as well.
> >
> > Because the CPU_POST_DEAD action is performed with the hotplug lock
> > released, CPU_DOWN might block an existing gov_queue_work() user (blocked
> > on get_online_cpus()) and unblock it with one of policy->cpus offlined;
> > thus a cpu_is_offline() check is performed in __gov_queue_work().
> >
> > Besides, the existing gov_queue_work() hotplug guard is extended to
> > protect all __gov_queue_work() calls: both the all_cpus and !all_cpus
> > cases.
> >
> > CPUFREQ_GOV_START performs a direct __gov_queue_work() call because the
> > hotplug lock is already held there, as opposed to the previous
> > gov_queue_work() with its nested get/put_online_cpus().
>
> Nice to know you have some ideas on solving the issue ;-)
>
> I'm not sure whether I've caught the idea, but it seems like you are
> trying to re-organize the timing of device add/remove.
>
> I'm sure we have more than one way to solve these issues, but what
> we need is a cure for the root cause...
>
> As Srivatsa discovered, the root issue may be that
> gov_cancel_work() fails to stop all the work by the time it returns.
>
> And Viresh also confirmed that this is not by design.
>
> Which means the gov_queue_work() invoked by od_dbs_timer() is supposed
> to never happen after the CPUFREQ_GOV_STOP notify; the whole policy
> should stop working at that time.
>
> But it fails to, and the work running concurrently with the cpu dying
> caused the first problem.
>
> Thus I think we should focus on this; I've suggested the fix below, and
> I'd like to know your opinions :)
>
> Regards,
> Michael Wang
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index dc9b72e..a64b544 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> {
> int i;
>
> + if (dbs_data->queue_stop)
> + return;
> +
> if (!all_cpus) {
> __gov_queue_work(smp_processor_id(), dbs_data, delay);
> } else {
> - get_online_cpus();
> for_each_cpu(i, policy->cpus)
> __gov_queue_work(i, dbs_data, delay);
> - put_online_cpus();
> }
> }
> EXPORT_SYMBOL_GPL(gov_queue_work);
> @@ -193,12 +194,27 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
> struct cpufreq_policy *policy)
> {
> struct cpu_dbs_common_info *cdbs;
> - int i;
> + int i, round = 2;
>
> + dbs_data->queue_stop = 1;
> +redo:
> + round--;
> for_each_cpu(i, policy->cpus) {
> cdbs = dbs_data->cdata->get_cpu_cdbs(i);
> cancel_delayed_work_sync(&cdbs->work);
> }
> +
> + /*
> + * Since there is no lock to prevent re-queueing of the
> + * cancelled work, some early cancelled work might
> + * have been queued again by later cancelled work.
> + *
> + * Flush the work again with dbs_data->queue_stop
> + * enabled, this time there will be no survivors.
> + */
> + if (round)
> + goto redo;

Well, what about doing:

for (round = 2; round; round--)
for_each_cpu(i, policy->cpus) {
cdbs = dbs_data->cdata->get_cpu_cdbs(i);
cancel_delayed_work_sync(&cdbs->work);
}

instead?

> + dbs_data->queue_stop = 0;
> }
>
> /* Will return if we need to evaluate cpu load again or not */
> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
> index e16a961..9116135 100644
> --- a/drivers/cpufreq/cpufreq_governor.h
> +++ b/drivers/cpufreq/cpufreq_governor.h
> @@ -213,6 +213,7 @@ struct dbs_data {
> unsigned int min_sampling_rate;
> int usage_count;
> void *tuners;
> + int queue_stop;
>
> /* dbs_mutex protects dbs_enable in governor start/stop */
> struct mutex mutex;
>


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2013-07-15 02:43:16

by Michael wang

Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On 07/14/2013 07:47 PM, Sergey Senozhatsky wrote:
[snip]
>
> Hello,
>
> I just realized that lockdep was disabling itself at startup (after recent AMD
> radeon patch set) due to radeon kms error:
>
> [ 4.790019] [drm] Loading CEDAR Microcode
> [ 4.790943] r600_cp: Failed to load firmware "radeon/CEDAR_smc.bin"
> [ 4.791152] [drm:evergreen_startup] *ERROR* Failed to load firmware!
> [ 4.791330] radeon 0000:01:00.0: disabling GPU acceleration
> [ 4.792633] INFO: trying to register non-static key.
> [ 4.792792] the code is fine but needs lockdep annotation.
> [ 4.792953] turning off the locking correctness validator.
>
>
> Now, as I fixed radeon kms, I can see:
>
> [ 806.660530] ------------[ cut here ]------------
> [ 806.660539] WARNING: CPU: 0 PID: 2389 at arch/x86/kernel/smp.c:124
> native_smp_send_reschedule+0x57/0x60()
>
> [ 806.660572] Workqueue: events od_dbs_timer
> [ 806.660575] 0000000000000009 ffff8801531cfbd8 ffffffff816044ee
> 0000000000000000
> [ 806.660577] ffff8801531cfc10 ffffffff8104995d 0000000000000003
> ffff8801531f8000
> [ 806.660579] 000000010001ee39 0000000000000003 0000000000000003
> ffff8801531cfc20
> [ 806.660580] Call Trace:
> [ 806.660587] [<ffffffff816044ee>] dump_stack+0x4e/0x82
> [ 806.660591] [<ffffffff8104995d>] warn_slowpath_common+0x7d/0xa0
> [ 806.660593] [<ffffffff81049a3a>] warn_slowpath_null+0x1a/0x20
> [ 806.660595] [<ffffffff8102ca07>] native_smp_send_reschedule+0x57/0x60
> [ 806.660599] [<ffffffff81085211>] wake_up_nohz_cpu+0x61/0xb0
> [ 806.660603] [<ffffffff8105cb6d>] add_timer_on+0x8d/0x1e0
> [ 806.660607] [<ffffffff8106cc66>] __queue_delayed_work+0x166/0x1a0
> [ 806.660609] [<ffffffff8106d6a9>] ? try_to_grab_pending+0xd9/0x1a0
> [ 806.660611] [<ffffffff8106d7bf>] mod_delayed_work_on+0x4f/0x90
> [ 806.660613] [<ffffffff8150f436>] gov_queue_work+0x56/0xd0
> [ 806.660615] [<ffffffff8150e740>] od_dbs_timer+0xc0/0x160
> [ 806.660617] [<ffffffff8106dbcd>] process_one_work+0x1cd/0x6a0
> [ 806.660619] [<ffffffff8106db63>] ? process_one_work+0x163/0x6a0
> [ 806.660622] [<ffffffff8106e8d1>] worker_thread+0x121/0x3a0
> [ 806.660627] [<ffffffff810b668d>] ? trace_hardirqs_on+0xd/0x10
> [ 806.660629] [<ffffffff8106e7b0>] ? manage_workers.isra.24+0x2a0/0x2a0
> [ 806.660633] [<ffffffff810760cb>] kthread+0xdb/0xe0
> [ 806.660635] [<ffffffff81075ff0>] ? insert_kthread_work+0x70/0x70
> [ 806.660639] [<ffffffff8160de6c>] ret_from_fork+0x7c/0xb0
> [ 806.660640] [<ffffffff81075ff0>] ? insert_kthread_work+0x70/0x70
> [ 806.660642] ---[ end trace 01ae278488a0ad6d ]---

So it back again...

Currently I have a few hypotheses in mind:
1. we still failed to stop od_dbs_timer after the STOP notify.
2. there is some other code which restarts the work after the STOP notify.
3. policy->cpus is broken.

I think we need a more detailed investigation this time; let's catch that
mouse ;-)

Regards,
Michael Wang

>
>
> The same problem is why get/put_online_cpus() was added to __gov_queue_work():
>
> commit 2f7021a815f20f3481c10884fe9735ce2a56db35
> Author: Michael Wang
>
> cpufreq: protect 'policy->cpus' from offlining during
> __gov_queue_work()
>
> -ss
>
>> Regards,
>> Michael Wang
>>
>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>> index dc9b72e..a64b544 100644
>> --- a/drivers/cpufreq/cpufreq_governor.c
>> +++ b/drivers/cpufreq/cpufreq_governor.c
>> @@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>> {
>> int i;
>>
>> + if (dbs_data->queue_stop)
>> + return;
>> +
>> if (!all_cpus) {
>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
>> } else {
>> - get_online_cpus();
>> for_each_cpu(i, policy->cpus)
>> __gov_queue_work(i, dbs_data, delay);
>> - put_online_cpus();
>> }
>> }
>> EXPORT_SYMBOL_GPL(gov_queue_work);
>> @@ -193,12 +194,27 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
>> struct cpufreq_policy *policy)
>> {
>> struct cpu_dbs_common_info *cdbs;
>> - int i;
>> + int i, round = 2;
>>
>> + dbs_data->queue_stop = 1;
>> +redo:
>> + round--;
>> for_each_cpu(i, policy->cpus) {
>> cdbs = dbs_data->cdata->get_cpu_cdbs(i);
>> cancel_delayed_work_sync(&cdbs->work);
>> }
>> +
>> + /*
>> + * Since there is no lock to prevent re-queueing of the
>> + * cancelled work, some early cancelled work might
>> + * have been queued again by later cancelled work.
>> + *
>> + * Flush the work again with dbs_data->queue_stop
>> + * enabled, this time there will be no survivors.
>> + */
>> + if (round)
>> + goto redo;
>> + dbs_data->queue_stop = 0;
>> }
>>
>> /* Will return if we need to evaluate cpu load again or not */
>> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
>> index e16a961..9116135 100644
>> --- a/drivers/cpufreq/cpufreq_governor.h
>> +++ b/drivers/cpufreq/cpufreq_governor.h
>> @@ -213,6 +213,7 @@ struct dbs_data {
>> unsigned int min_sampling_rate;
>> int usage_count;
>> void *tuners;
>> + int queue_stop;
>>
>> /* dbs_mutex protects dbs_enable in governor start/stop */
>> struct mutex mutex;
>>
>>>
>>> Signed-off-by: Sergey Senozhatsky <[email protected]>
>>>
>>> ---
>>>
>>> drivers/cpufreq/cpufreq.c | 5 +----
>>> drivers/cpufreq/cpufreq_governor.c | 17 +++++++++++------
>>> drivers/cpufreq/cpufreq_stats.c | 2 +-
>>> 3 files changed, 13 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>> index 6a015ad..f8aacf1 100644
>>> --- a/drivers/cpufreq/cpufreq.c
>>> +++ b/drivers/cpufreq/cpufreq.c
>>> @@ -1943,13 +1943,10 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
>>> case CPU_ONLINE:
>>> cpufreq_add_dev(dev, NULL);
>>> break;
>>> - case CPU_DOWN_PREPARE:
>>> + case CPU_POST_DEAD:
>>> case CPU_UP_CANCELED_FROZEN:
>>> __cpufreq_remove_dev(dev, NULL);
>>> break;
>>> - case CPU_DOWN_FAILED:
>>> - cpufreq_add_dev(dev, NULL);
>>> - break;
>>> }
>>> }
>>> return NOTIFY_OK;
>>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>>> index 4645876..681d5d6 100644
>>> --- a/drivers/cpufreq/cpufreq_governor.c
>>> +++ b/drivers/cpufreq/cpufreq_governor.c
>>> @@ -125,7 +125,11 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
>>> unsigned int delay)
>>> {
>>> struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
>>> -
>>> + /* cpu offline might block existing gov_queue_work() user,
>>> + * unblocking it after CPU_DEAD and before CPU_POST_DEAD.
>>> + * thus potentially we can hit offlined CPU */
>>> + if (unlikely(cpu_is_offline(cpu)))
>>> + return;
>>> mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
>>> }
>>>
>>> @@ -133,15 +137,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>>> unsigned int delay, bool all_cpus)
>>> {
>>> int i;
>>> -
>>> + get_online_cpus();
>>> if (!all_cpus) {
>>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
>>> } else {
>>> - get_online_cpus();
>>> for_each_cpu(i, policy->cpus)
>>> __gov_queue_work(i, dbs_data, delay);
>>> - put_online_cpus();
>>> }
>>> + put_online_cpus();
>>> }
>>> EXPORT_SYMBOL_GPL(gov_queue_work);
>>>
>>> @@ -354,8 +357,10 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>>> /* Initiate timer time stamp */
>>> cpu_cdbs->time_stamp = ktime_get();
>>>
>>> - gov_queue_work(dbs_data, policy,
>>> - delay_for_sampling_rate(sampling_rate), true);
>>> + /* hotplug lock already held */
>>> + for_each_cpu(j, policy->cpus)
>>> + __gov_queue_work(j, dbs_data,
>>> + delay_for_sampling_rate(sampling_rate));
>>> break;
>>>
>>> case CPUFREQ_GOV_STOP:
>>> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
>>> index cd9e817..833816e 100644
>>> --- a/drivers/cpufreq/cpufreq_stats.c
>>> +++ b/drivers/cpufreq/cpufreq_stats.c
>>> @@ -355,7 +355,7 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
>>> case CPU_DOWN_PREPARE:
>>> cpufreq_stats_free_sysfs(cpu);
>>> break;
>>> - case CPU_DEAD:
>>> + case CPU_POST_DEAD:
>>> cpufreq_stats_free_table(cpu);
>>> break;
>>> case CPU_UP_CANCELED_FROZEN:
>>>
>>
>

2013-07-15 02:47:07

by Michael wang

Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On 07/14/2013 11:56 PM, Rafael J. Wysocki wrote:
[snip]
>> +
>> + /*
>> + * Since there is no lock to prevent re-queueing of the
>> + * cancelled work, some early cancelled work might
>> + * have been queued again by later cancelled work.
>> + *
>> + * Flush the work again with dbs_data->queue_stop
>> + * enabled, this time there will be no survivors.
>> + */
>> + if (round)
>> + goto redo;
>
> Well, what about doing:
>
> for (round = 2; round; round--)
> for_each_cpu(i, policy->cpus) {
> cdbs = dbs_data->cdata->get_cpu_cdbs(i);
> cancel_delayed_work_sync(&cdbs->work);
> }
>
> instead?
>

It could work, though I somewhat dislike the nested 'for' logic...

Anyway, it seems we have not solved the issue yet, so let's set these
aside and focus on the fix first ;-)

Regards,
Michael Wang

>> + dbs_data->queue_stop = 0;
>> }
>>
>> /* Will return if we need to evaluate cpu load again or not */
>> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
>> index e16a961..9116135 100644
>> --- a/drivers/cpufreq/cpufreq_governor.h
>> +++ b/drivers/cpufreq/cpufreq_governor.h
>> @@ -213,6 +213,7 @@ struct dbs_data {
>> unsigned int min_sampling_rate;
>> int usage_count;
>> void *tuners;
>> + int queue_stop;
>>
>> /* dbs_mutex protects dbs_enable in governor start/stop */
>> struct mutex mutex;
>>
>
>

2013-07-15 03:50:56

by Michael wang

Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On 07/14/2013 08:06 PM, Sergey Senozhatsky wrote:
> On (07/14/13 14:47), Sergey Senozhatsky wrote:
>>
>> Now, as I fixed radeon kms, I can see:
>>
>> [ 806.660530] ------------[ cut here ]------------
>> [ 806.660539] WARNING: CPU: 0 PID: 2389 at arch/x86/kernel/smp.c:124
>> native_smp_send_reschedule+0x57/0x60()
>
> Well, this one is obviously not a lockdep error, I meant that previous
> tests with disabled lockdep were invalid. Will re-do.
>

And maybe we could try the patch below to get more info. I've moved the
timing of restoring the stop flag from 'after STOP' to 'before START'; I
suppose that could create a window to prevent the work from being
re-queued, and it could at least provide us more info...

I think I may need to set up an environment for debugging now; what are
the steps to reproduce this WARN?

Regards,
Michael Wang


diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index dc9b72e..b1446fe 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
{
int i;

+ if (!dbs_data->queue_start)
+ return;
+
if (!all_cpus) {
__gov_queue_work(smp_processor_id(), dbs_data, delay);
} else {
- get_online_cpus();
for_each_cpu(i, policy->cpus)
__gov_queue_work(i, dbs_data, delay);
- put_online_cpus();
}
}
EXPORT_SYMBOL_GPL(gov_queue_work);
@@ -193,12 +194,26 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
struct cpufreq_policy *policy)
{
struct cpu_dbs_common_info *cdbs;
- int i;
+ int i, round = 2;

+ dbs_data->queue_start = 0;
+redo:
+ round--;
for_each_cpu(i, policy->cpus) {
cdbs = dbs_data->cdata->get_cpu_cdbs(i);
cancel_delayed_work_sync(&cdbs->work);
}
+
+ /*
+ * Since there is no lock to prevent re-queueing of the
+ * cancelled work, some early cancelled work might
+ * have been queued again by later cancelled work.
+ *
+ * Flush the work again with dbs_data->queue_start
+ * cleared; this time there will be no survivors.
+ */
+ if (round)
+ goto redo;
}

/* Will return if we need to evaluate cpu load again or not */
@@ -391,6 +406,7 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,

/* Initiate timer time stamp */
cpu_cdbs->time_stamp = ktime_get();
+ dbs_data->queue_start = 1;

gov_queue_work(dbs_data, policy,
delay_for_sampling_rate(sampling_rate), true);
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index e16a961..9116135 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -213,6 +213,7 @@ struct dbs_data {
unsigned int min_sampling_rate;
int usage_count;
void *tuners;
+ int queue_start;

/* dbs_mutex protects dbs_enable in governor start/stop */
struct mutex mutex;



> -ss
>
>>> Regards,
>>> Michael Wang
>>>
>>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>>> index dc9b72e..a64b544 100644
>>> --- a/drivers/cpufreq/cpufreq_governor.c
>>> +++ b/drivers/cpufreq/cpufreq_governor.c
>>> @@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>>> {
>>> int i;
>>>
>>> + if (dbs_data->queue_stop)
>>> + return;
>>> +
>>> if (!all_cpus) {
>>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
>>> } else {
>>> - get_online_cpus();
>>> for_each_cpu(i, policy->cpus)
>>> __gov_queue_work(i, dbs_data, delay);
>>> - put_online_cpus();
>>> }
>>> }
>>> EXPORT_SYMBOL_GPL(gov_queue_work);
>>> @@ -193,12 +194,27 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
>>> struct cpufreq_policy *policy)
>>> {
>>> struct cpu_dbs_common_info *cdbs;
>>> - int i;
>>> + int i, round = 2;
>>>
>>> + dbs_data->queue_stop = 1;
>>> +redo:
>>> + round--;
>>> for_each_cpu(i, policy->cpus) {
>>> cdbs = dbs_data->cdata->get_cpu_cdbs(i);
>>> cancel_delayed_work_sync(&cdbs->work);
>>> }
>>> +
>>> + /*
>>> + * Since there is no lock to prevent re-queueing of the
>>> + * cancelled work, some early cancelled work might
>>> + * have been queued again by later cancelled work.
>>> + *
>>> + * Flush the work again with dbs_data->queue_stop
>>> + * enabled, this time there will be no survivors.
>>> + */
>>> + if (round)
>>> + goto redo;
>>> + dbs_data->queue_stop = 0;
>>> }
>>>
>>> /* Will return if we need to evaluate cpu load again or not */
>>> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
>>> index e16a961..9116135 100644
>>> --- a/drivers/cpufreq/cpufreq_governor.h
>>> +++ b/drivers/cpufreq/cpufreq_governor.h
>>> @@ -213,6 +213,7 @@ struct dbs_data {
>>> unsigned int min_sampling_rate;
>>> int usage_count;
>>> void *tuners;
>>> + int queue_stop;
>>>
>>> /* dbs_mutex protects dbs_enable in governor start/stop */
>>> struct mutex mutex;
>>>
>>>>
>>>> Signed-off-by: Sergey Senozhatsky <[email protected]>
>>>>
>>>> ---
>>>>
>>>> drivers/cpufreq/cpufreq.c | 5 +----
>>>> drivers/cpufreq/cpufreq_governor.c | 17 +++++++++++------
>>>> drivers/cpufreq/cpufreq_stats.c | 2 +-
>>>> 3 files changed, 13 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>>> index 6a015ad..f8aacf1 100644
>>>> --- a/drivers/cpufreq/cpufreq.c
>>>> +++ b/drivers/cpufreq/cpufreq.c
>>>> @@ -1943,13 +1943,10 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
>>>> case CPU_ONLINE:
>>>> cpufreq_add_dev(dev, NULL);
>>>> break;
>>>> - case CPU_DOWN_PREPARE:
>>>> + case CPU_POST_DEAD:
>>>> case CPU_UP_CANCELED_FROZEN:
>>>> __cpufreq_remove_dev(dev, NULL);
>>>> break;
>>>> - case CPU_DOWN_FAILED:
>>>> - cpufreq_add_dev(dev, NULL);
>>>> - break;
>>>> }
>>>> }
>>>> return NOTIFY_OK;
>>>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>>>> index 4645876..681d5d6 100644
>>>> --- a/drivers/cpufreq/cpufreq_governor.c
>>>> +++ b/drivers/cpufreq/cpufreq_governor.c
>>>> @@ -125,7 +125,11 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
>>>> unsigned int delay)
>>>> {
>>>> struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
>>>> -
>>>> + /* cpu offline might block existing gov_queue_work() user,
>>>> + * unblocking it after CPU_DEAD and before CPU_POST_DEAD.
>>>> + * thus potentially we can hit offlined CPU */
>>>> + if (unlikely(cpu_is_offline(cpu)))
>>>> + return;
>>>> mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
>>>> }
>>>>
>>>> @@ -133,15 +137,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>>>> unsigned int delay, bool all_cpus)
>>>> {
>>>> int i;
>>>> -
>>>> + get_online_cpus();
>>>> if (!all_cpus) {
>>>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
>>>> } else {
>>>> - get_online_cpus();
>>>> for_each_cpu(i, policy->cpus)
>>>> __gov_queue_work(i, dbs_data, delay);
>>>> - put_online_cpus();
>>>> }
>>>> + put_online_cpus();
>>>> }
>>>> EXPORT_SYMBOL_GPL(gov_queue_work);
>>>>
>>>> @@ -354,8 +357,10 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>>>> /* Initiate timer time stamp */
>>>> cpu_cdbs->time_stamp = ktime_get();
>>>>
>>>> - gov_queue_work(dbs_data, policy,
>>>> - delay_for_sampling_rate(sampling_rate), true);
>>>> + /* hotplug lock already held */
>>>> + for_each_cpu(j, policy->cpus)
>>>> + __gov_queue_work(j, dbs_data,
>>>> + delay_for_sampling_rate(sampling_rate));
>>>> break;
>>>>
>>>> case CPUFREQ_GOV_STOP:
>>>> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
>>>> index cd9e817..833816e 100644
>>>> --- a/drivers/cpufreq/cpufreq_stats.c
>>>> +++ b/drivers/cpufreq/cpufreq_stats.c
>>>> @@ -355,7 +355,7 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
>>>> case CPU_DOWN_PREPARE:
>>>> cpufreq_stats_free_sysfs(cpu);
>>>> break;
>>>> - case CPU_DEAD:
>>>> + case CPU_POST_DEAD:
>>>> cpufreq_stats_free_table(cpu);
>>>> break;
>>>> case CPU_UP_CANCELED_FROZEN:
>>>>
>>>
>

2013-07-15 07:52:42

by Michael wang

Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On 07/15/2013 11:50 AM, Michael Wang wrote:
> On 07/14/2013 08:06 PM, Sergey Senozhatsky wrote:
>> On (07/14/13 14:47), Sergey Senozhatsky wrote:
>>>
>>> Now, as I fixed radeon kms, I can see:
>>>
>>> [ 806.660530] ------------[ cut here ]------------
>>> [ 806.660539] WARNING: CPU: 0 PID: 2389 at arch/x86/kernel/smp.c:124
>>> native_smp_send_reschedule+0x57/0x60()
>>
>> Well, this one is obviously not a lockdep error, I meant that previous
>> tests with disabled lockdep were invalid. Will re-do.
>>
>
> And maybe we could try the patch below to get more info. I've moved the
> timing of restoring the stop flag from 'after STOP' to 'before START'; I
> suppose that could create a window to prevent the work from being
> re-queued, and it could at least provide us more info...
>
> I think I may need to set up an environment for debugging now; what are
> the steps to reproduce this WARN?

I have done some testing; although I failed to reproduce this WARN, I
found that the work is still running and re-queueing itself after STOP,
even with my previous suggestion...

However, after enlarging the stop window as suggested below, the work
does stop... I suppose that will also silence the first WARN.

Now we need to figure out the reason...

Regards,
Michael Wang

>
> Regards,
> Michael Wang
>
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index dc9b72e..b1446fe 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> {
> int i;
>
> + if (!dbs_data->queue_start)
> + return;
> +
> if (!all_cpus) {
> __gov_queue_work(smp_processor_id(), dbs_data, delay);
> } else {
> - get_online_cpus();
> for_each_cpu(i, policy->cpus)
> __gov_queue_work(i, dbs_data, delay);
> - put_online_cpus();
> }
> }
> EXPORT_SYMBOL_GPL(gov_queue_work);
> @@ -193,12 +194,26 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
> struct cpufreq_policy *policy)
> {
> struct cpu_dbs_common_info *cdbs;
> - int i;
> + int i, round = 2;
>
> + dbs_data->queue_start = 0;
> +redo:
> + round--;
> for_each_cpu(i, policy->cpus) {
> cdbs = dbs_data->cdata->get_cpu_cdbs(i);
> cancel_delayed_work_sync(&cdbs->work);
> }
> +
> + /*
> + * Since there is no lock to prevent re-queueing of the
> + * cancelled work, some early cancelled work might
> + * have been queued again by later cancelled work.
> + *
> + * Flush the work again with dbs_data->queue_start
> + * cleared; this time there will be no survivors.
> + */
> + if (round)
> + goto redo;
> }
>
> /* Will return if we need to evaluate cpu load again or not */
> @@ -391,6 +406,7 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>
> /* Initiate timer time stamp */
> cpu_cdbs->time_stamp = ktime_get();
> + dbs_data->queue_start = 1;
>
> gov_queue_work(dbs_data, policy,
> delay_for_sampling_rate(sampling_rate), true);
> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
> index e16a961..9116135 100644
> --- a/drivers/cpufreq/cpufreq_governor.h
> +++ b/drivers/cpufreq/cpufreq_governor.h
> @@ -213,6 +213,7 @@ struct dbs_data {
> unsigned int min_sampling_rate;
> int usage_count;
> void *tuners;
> + int queue_start;
>
> /* dbs_mutex protects dbs_enable in governor start/stop */
> struct mutex mutex;
>
>
>
>> -ss
>>
>>>> Regards,
>>>> Michael Wang
>>>>
>>>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>>>> index dc9b72e..a64b544 100644
>>>> --- a/drivers/cpufreq/cpufreq_governor.c
>>>> +++ b/drivers/cpufreq/cpufreq_governor.c
>>>> @@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>>>> {
>>>> int i;
>>>>
>>>> + if (dbs_data->queue_stop)
>>>> + return;
>>>> +
>>>> if (!all_cpus) {
>>>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
>>>> } else {
>>>> - get_online_cpus();
>>>> for_each_cpu(i, policy->cpus)
>>>> __gov_queue_work(i, dbs_data, delay);
>>>> - put_online_cpus();
>>>> }
>>>> }
>>>> EXPORT_SYMBOL_GPL(gov_queue_work);
>>>> @@ -193,12 +194,27 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
>>>> struct cpufreq_policy *policy)
>>>> {
>>>> struct cpu_dbs_common_info *cdbs;
>>>> - int i;
>>>> + int i, round = 2;
>>>>
>>>> + dbs_data->queue_stop = 1;
>>>> +redo:
>>>> + round--;
>>>> for_each_cpu(i, policy->cpus) {
>>>> cdbs = dbs_data->cdata->get_cpu_cdbs(i);
>>>> cancel_delayed_work_sync(&cdbs->work);
>>>> }
>>>> +
>>>> + /*
>>>> + * Since there is no lock to prevent re-queueing of the
>>>> + * cancelled work, some early cancelled work might
>>>> + * have been queued again by later cancelled work.
>>>> + *
>>>> + * Flush the work again with dbs_data->queue_stop
>>>> + * enabled, this time there will be no survivors.
>>>> + */
>>>> + if (round)
>>>> + goto redo;
>>>> + dbs_data->queue_stop = 0;
>>>> }
>>>>
>>>> /* Will return if we need to evaluate cpu load again or not */
>>>> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
>>>> index e16a961..9116135 100644
>>>> --- a/drivers/cpufreq/cpufreq_governor.h
>>>> +++ b/drivers/cpufreq/cpufreq_governor.h
>>>> @@ -213,6 +213,7 @@ struct dbs_data {
>>>> unsigned int min_sampling_rate;
>>>> int usage_count;
>>>> void *tuners;
>>>> + int queue_stop;
>>>>
>>>> /* dbs_mutex protects dbs_enable in governor start/stop */
>>>> struct mutex mutex;
>>>>
>>>>>
>>>>> Signed-off-by: Sergey Senozhatsky <[email protected]>
>>>>>
>>>>> ---
>>>>>
>>>>> drivers/cpufreq/cpufreq.c | 5 +----
>>>>> drivers/cpufreq/cpufreq_governor.c | 17 +++++++++++------
>>>>> drivers/cpufreq/cpufreq_stats.c | 2 +-
>>>>> 3 files changed, 13 insertions(+), 11 deletions(-)
>>>>>
>>>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>>>> index 6a015ad..f8aacf1 100644
>>>>> --- a/drivers/cpufreq/cpufreq.c
>>>>> +++ b/drivers/cpufreq/cpufreq.c
>>>>> @@ -1943,13 +1943,10 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
>>>>> case CPU_ONLINE:
>>>>> cpufreq_add_dev(dev, NULL);
>>>>> break;
>>>>> - case CPU_DOWN_PREPARE:
>>>>> + case CPU_POST_DEAD:
>>>>> case CPU_UP_CANCELED_FROZEN:
>>>>> __cpufreq_remove_dev(dev, NULL);
>>>>> break;
>>>>> - case CPU_DOWN_FAILED:
>>>>> - cpufreq_add_dev(dev, NULL);
>>>>> - break;
>>>>> }
>>>>> }
>>>>> return NOTIFY_OK;
>>>>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>>>>> index 4645876..681d5d6 100644
>>>>> --- a/drivers/cpufreq/cpufreq_governor.c
>>>>> +++ b/drivers/cpufreq/cpufreq_governor.c
>>>>> @@ -125,7 +125,11 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
>>>>> unsigned int delay)
>>>>> {
>>>>> struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
>>>>> -
>>>>> +	/* CPU offline might block an existing gov_queue_work() user,
>>>>> +	 * unblocking it after CPU_DEAD and before CPU_POST_DEAD;
>>>>> +	 * thus we can potentially hit an offlined CPU. */
>>>>> + if (unlikely(cpu_is_offline(cpu)))
>>>>> + return;
>>>>> mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
>>>>> }
>>>>>
>>>>> @@ -133,15 +137,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>>>>> unsigned int delay, bool all_cpus)
>>>>> {
>>>>> int i;
>>>>> -
>>>>> + get_online_cpus();
>>>>> if (!all_cpus) {
>>>>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
>>>>> } else {
>>>>> - get_online_cpus();
>>>>> for_each_cpu(i, policy->cpus)
>>>>> __gov_queue_work(i, dbs_data, delay);
>>>>> - put_online_cpus();
>>>>> }
>>>>> + put_online_cpus();
>>>>> }
>>>>> EXPORT_SYMBOL_GPL(gov_queue_work);
>>>>>
>>>>> @@ -354,8 +357,10 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>>>>> /* Initiate timer time stamp */
>>>>> cpu_cdbs->time_stamp = ktime_get();
>>>>>
>>>>> - gov_queue_work(dbs_data, policy,
>>>>> - delay_for_sampling_rate(sampling_rate), true);
>>>>> + /* hotplug lock already held */
>>>>> + for_each_cpu(j, policy->cpus)
>>>>> + __gov_queue_work(j, dbs_data,
>>>>> + delay_for_sampling_rate(sampling_rate));
>>>>> break;
>>>>>
>>>>> case CPUFREQ_GOV_STOP:
>>>>> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
>>>>> index cd9e817..833816e 100644
>>>>> --- a/drivers/cpufreq/cpufreq_stats.c
>>>>> +++ b/drivers/cpufreq/cpufreq_stats.c
>>>>> @@ -355,7 +355,7 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
>>>>> case CPU_DOWN_PREPARE:
>>>>> cpufreq_stats_free_sysfs(cpu);
>>>>> break;
>>>>> - case CPU_DEAD:
>>>>> + case CPU_POST_DEAD:
>>>>> cpufreq_stats_free_table(cpu);
>>>>> break;
>>>>> case CPU_UP_CANCELED_FROZEN:
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>>>>> the body of a message to [email protected]
>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>> Please read the FAQ at http://www.tux.org/lkml/
>>>>>
>>>>

2013-07-15 08:29:59

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On (07/15/13 15:52), Michael Wang wrote:
> >
> > And maybe we could try the patch below to get more info. I've moved the
> > timing of restoring the stop flag from 'after STOP' to 'before START'; I
> > suppose that could create a window to prevent the work re-queue, and it
> > could at least provide us more info...
> >
> > I think I may need to set up an environment for debugging now; what are
> > the steps to reproduce this WARN?
>
> I have done some tests; although I failed to reproduce this WARN, I
> found that the work is still running and re-queues itself after STOP,
> even with my previous suggestion...
>
> However, after enlarging the stop window as suggested below, the work
> does stop... I suppose it will also prevent the first WARN.
>
> Now we need to figure out the reason...
>

Hello,

WARN is triggered during laptop suspend/hibernate phase.
I'll test your patch soon.

-ss

> Regards,
> Michael Wang
>
> >
> > Regards,
> > Michael Wang
> >
> >
> > diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> > index dc9b72e..b1446fe 100644
> > --- a/drivers/cpufreq/cpufreq_governor.c
> > +++ b/drivers/cpufreq/cpufreq_governor.c
> > @@ -178,13 +178,14 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> > {
> > int i;
> >
> > + if (!dbs_data->queue_start)
> > + return;
> > +
> > if (!all_cpus) {
> > __gov_queue_work(smp_processor_id(), dbs_data, delay);
> > } else {
> > - get_online_cpus();
> > for_each_cpu(i, policy->cpus)
> > __gov_queue_work(i, dbs_data, delay);
> > - put_online_cpus();
> > }
> > }
> > EXPORT_SYMBOL_GPL(gov_queue_work);
> > @@ -193,12 +194,26 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
> > struct cpufreq_policy *policy)
> > {
> > struct cpu_dbs_common_info *cdbs;
> > - int i;
> > + int i, round = 2;
> >
> > + dbs_data->queue_start = 0;
> > +redo:
> > + round--;
> > for_each_cpu(i, policy->cpus) {
> > cdbs = dbs_data->cdata->get_cpu_cdbs(i);
> > cancel_delayed_work_sync(&cdbs->work);
> > }
> > +
> > + /*
> > +	 * Since there is no lock to prevent the cancelled
> > +	 * work from being re-queued, some early-cancelled
> > +	 * work might have been queued again by work that
> > +	 * was cancelled later.
> > +	 *
> > +	 * Cancel the work again with dbs_data->queue_start
> > +	 * cleared; this time there will be no survivors.
> > + */
> > + if (round)
> > + goto redo;
> > }
> >
> > /* Will return if we need to evaluate cpu load again or not */
> > @@ -391,6 +406,7 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
> >
> > /* Initiate timer time stamp */
> > cpu_cdbs->time_stamp = ktime_get();
> > + dbs_data->queue_start = 1;
> >
> > gov_queue_work(dbs_data, policy,
> > delay_for_sampling_rate(sampling_rate), true);
> > diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
> > index e16a961..9116135 100644
> > --- a/drivers/cpufreq/cpufreq_governor.h
> > +++ b/drivers/cpufreq/cpufreq_governor.h
> > @@ -213,6 +213,7 @@ struct dbs_data {
> > unsigned int min_sampling_rate;
> > int usage_count;
> > void *tuners;
> > + int queue_start;
> >
> > /* dbs_mutex protects dbs_enable in governor start/stop */
> > struct mutex mutex;
> >
> >
> >
> >> [...]
>

2013-07-15 13:23:21

by Srivatsa S. Bhat

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On 07/15/2013 01:59 PM, Sergey Senozhatsky wrote:
> On (07/15/13 15:52), Michael Wang wrote:
>>>
>>> And maybe we could try the patch below to get more info. I've moved the
>>> timing of restoring the stop flag from 'after STOP' to 'before START'; I
>>> suppose that could create a window to prevent the work re-queue, and it
>>> could at least provide us more info...
>>>
>>> I think I may need to set up an environment for debugging now; what are
>>> the steps to reproduce this WARN?
>>
>> I have done some tests; although I failed to reproduce this WARN, I
>> found that the work is still running and re-queues itself after STOP,
>> even with my previous suggestion...
>>
>> However, after enlarging the stop window as suggested below, the work
>> does stop... I suppose it will also prevent the first WARN.
>>
>> Now we need to figure out the reason...
>>
>
> Hello,
>
> WARN is triggered during laptop suspend/hibernate phase.
> I'll test your patch soon.
>

Hi,

I think I finally found out what exactly is going wrong! :-)

I tried reproducing the problem on my machine, and found that the problem
(warning about IPIs to offline CPUs) happens *only* while doing suspend/resume
and not during halt/shutdown/reboot or regular CPU hotplug via sysfs files.
That got me thinking and I finally figured out that commit a66b2e5 is again
the culprit.

So here is the solution:

On 3.11-rc1, apply these patches in the order mentioned below, and check
whether it fixes _all_ problems (both the warnings about IPI as well as the
lockdep splat).

1. Patch given in: https://lkml.org/lkml/2013/7/11/661
(Just apply patch 1, not the entire patchset).

2. Apply the patch shown below, on top of the above patch:

---------------------------------------------------------------------------


From: Srivatsa S. Bhat <[email protected]>
Subject: [PATCH] cpufreq: Revert commit 2f7021a to fix CPU hotplug regression

commit 2f7021a (cpufreq: protect 'policy->cpus' from offlining during
__gov_queue_work()) caused a regression in CPU hotplug, because it led
to a deadlock between the cpufreq governor worker thread and the CPU hotplug
writer task.

Lockdep splat corresponding to this deadlock is shown below:

[ 60.277396] ======================================================
[ 60.277400] [ INFO: possible circular locking dependency detected ]
[ 60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
[ 60.277411] -------------------------------------------------------
[ 60.277417] bash/2225 is trying to acquire lock:
[ 60.277422] ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
[ 60.277444] but task is already holding lock:
[ 60.277449] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
[ 60.277465] which lock already depends on the new lock.

[ 60.277472] the existing dependency chain (in reverse order) is:
[ 60.277477] -> #2 (cpu_hotplug.lock){+.+.+.}:
[ 60.277490] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[ 60.277503] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
[ 60.277514] [<ffffffff81042cbc>] get_online_cpus+0x3c/0x60
[ 60.277522] [<ffffffff814b842a>] gov_queue_work+0x2a/0xb0
[ 60.277532] [<ffffffff814b7891>] cs_dbs_timer+0xc1/0xe0
[ 60.277543] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
[ 60.277552] [<ffffffff81063d31>] worker_thread+0x121/0x3a0
[ 60.277560] [<ffffffff8106ae2b>] kthread+0xdb/0xe0
[ 60.277569] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
[ 60.277580] -> #1 (&j_cdbs->timer_mutex){+.+...}:
[ 60.277592] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[ 60.277600] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
[ 60.277608] [<ffffffff814b785d>] cs_dbs_timer+0x8d/0xe0
[ 60.277616] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
[ 60.277624] [<ffffffff81063d31>] worker_thread+0x121/0x3a0
[ 60.277633] [<ffffffff8106ae2b>] kthread+0xdb/0xe0
[ 60.277640] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
[ 60.277649] -> #0 ((&(&j_cdbs->work)->work)){+.+...}:
[ 60.277661] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
[ 60.277669] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[ 60.277677] [<ffffffff810621ed>] flush_work+0x3d/0x280
[ 60.277685] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
[ 60.277693] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
[ 60.277701] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
[ 60.277709] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
[ 60.277719] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
[ 60.277728] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
[ 60.277737] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
[ 60.277747] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
[ 60.277759] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
[ 60.277768] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
[ 60.277779] [<ffffffff815a0d46>] cpu_down+0x36/0x50
[ 60.277788] [<ffffffff815a2748>] store_online+0x98/0xd0
[ 60.277796] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
[ 60.277806] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
[ 60.277818] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
[ 60.277826] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
[ 60.277834] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
[ 60.277842] other info that might help us debug this:

[ 60.277848] Chain exists of:
(&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock

[ 60.277864] Possible unsafe locking scenario:

[ 60.277869] CPU0 CPU1
[ 60.277873] ---- ----
[ 60.277877] lock(cpu_hotplug.lock);
[ 60.277885] lock(&j_cdbs->timer_mutex);
[ 60.277892] lock(cpu_hotplug.lock);
[ 60.277900] lock((&(&j_cdbs->work)->work));
[ 60.277907] *** DEADLOCK ***

[ 60.277915] 6 locks held by bash/2225:
[ 60.277919] #0: (sb_writers#6){.+.+.+}, at: [<ffffffff81168173>] vfs_write+0x1c3/0x1f0
[ 60.277937] #1: (&buffer->mutex){+.+.+.}, at: [<ffffffff811d9e3c>] sysfs_write_file+0x3c/0x150
[ 60.277954] #2: (s_active#61){.+.+.+}, at: [<ffffffff811d9ec3>] sysfs_write_file+0xc3/0x150
[ 60.277972] #3: (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81024cf7>] cpu_hotplug_driver_lock+0x17/0x20
[ 60.277990] #4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a0d32>] cpu_down+0x22/0x50
[ 60.278007] #5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
[ 60.278023] stack backtrace:
[ 60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744
[ 60.278037] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
[ 60.278042] ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38
[ 60.278055] ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2
[ 60.278068] ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00
[ 60.278081] Call Trace:
[ 60.278091] [<ffffffff815b3d90>] dump_stack+0x19/0x1b
[ 60.278101] [<ffffffff815b0a8d>] print_circular_bug+0x2b6/0x2c5
[ 60.278111] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
[ 60.278123] [<ffffffff81067e08>] ? __kernel_text_address+0x58/0x80
[ 60.278134] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[ 60.278142] [<ffffffff810621b5>] ? flush_work+0x5/0x280
[ 60.278151] [<ffffffff810621ed>] flush_work+0x3d/0x280
[ 60.278159] [<ffffffff810621b5>] ? flush_work+0x5/0x280
[ 60.278169] [<ffffffff810a9b14>] ? mark_held_locks+0x94/0x140
[ 60.278178] [<ffffffff81062d77>] ? __cancel_work_timer+0x77/0x120
[ 60.278188] [<ffffffff810a9cbd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[ 60.278196] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
[ 60.278206] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
[ 60.278214] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
[ 60.278225] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
[ 60.278234] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
[ 60.278244] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
[ 60.278255] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
[ 60.278265] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
[ 60.278275] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
[ 60.278284] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
[ 60.278292] [<ffffffff81024cf7>] ? cpu_hotplug_driver_lock+0x17/0x20
[ 60.278302] [<ffffffff815a0d46>] cpu_down+0x36/0x50
[ 60.278311] [<ffffffff815a2748>] store_online+0x98/0xd0
[ 60.278320] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
[ 60.278329] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
[ 60.278337] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
[ 60.278347] [<ffffffff81185950>] ? fget_light+0x320/0x4b0
[ 60.278355] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
[ 60.278364] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
[ 60.280582] smpboot: CPU 1 is now offline


The intent of this commit was to avoid warnings during CPU hotplug, which
indicated that offline CPUs were getting IPIs from the cpufreq governor's
work items. But the real root-cause of that problem was commit a66b2e5
(cpufreq: Preserve sysfs files across suspend/resume) because it totally
skipped all the cpufreq callbacks during CPU hotplug in the suspend/resume
path, and hence it never actually shut down the cpufreq governor's worker
threads during CPU offline in the suspend/resume path.

Reflecting back, the reason we never suspected that commit as the
root cause earlier was that the original issue was reported with just the
halt command, and nobody had brought suspend/resume into the equation.

The reason for _that_, in turn, turns out to be that halt/shutdown used
to be done by disabling non-boot CPUs while tasks were frozen, just like
suspend/resume... but commit cf7df378a (reboot: rigrate shutdown/reboot to
boot cpu), which came in around that very same time, changed that logic:
shutdown/halt no longer takes CPUs offline.
Thus, the test cases for reproducing the bug were vastly different, and
we went totally off the trail.

Overall, it was one hell of a confusing situation, with so many commits
affecting each other and also affecting the symptoms of the problem in
subtle ways. Finally, now that the original problematic commit (a66b2e5)
has been completely reverted, revert this intermediate fix too (2f7021a)
to fix the CPU hotplug deadlock. Phew!

Reported-by: Sergey Senozhatsky <[email protected]>
Reported-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Srivatsa S. Bhat <[email protected]>
---

drivers/cpufreq/cpufreq_governor.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 4645876..7b839a8 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -25,7 +25,6 @@
#include <linux/slab.h>
#include <linux/types.h>
#include <linux/workqueue.h>
-#include <linux/cpu.h>

#include "cpufreq_governor.h"

@@ -137,10 +136,8 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
if (!all_cpus) {
__gov_queue_work(smp_processor_id(), dbs_data, delay);
} else {
- get_online_cpus();
for_each_cpu(i, policy->cpus)
__gov_queue_work(i, dbs_data, delay);
- put_online_cpus();
}
}
EXPORT_SYMBOL_GPL(gov_queue_work);

2013-07-15 13:35:57

by Srivatsa S. Bhat

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On 07/15/2013 06:49 PM, Srivatsa S. Bhat wrote:
[...]
> Reported-by: Sergey Senozhatsky <[email protected]>
> Reported-by: Bartlomiej Zolnierkiewicz <[email protected]>
> Signed-off-by: Srivatsa S. Bhat <[email protected]>

Forgot to add: if this solves the issues people are facing, IMHO this should
also be CC'ed to stable, just like the full revert of a66b2e5.

Regards,
Srivatsa S. Bhat


2013-07-15 20:49:47

by Peter Wu

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

Hi,

I think I also encountered a similar issue after resume (and possibly a
real deadlock yesterday, before/during suspend?). One message:

[ 71.204848] ======================================================
[ 71.204850] [ INFO: possible circular locking dependency detected ]
[ 71.204852] 3.11.0-rc1cold-00008-g47188d3 #1 Tainted: G W
[ 71.204854] -------------------------------------------------------
[ 71.204855] ondemand/2034 is trying to acquire lock:
[ 71.204857] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8104ba31>] get_online_cpus+0x41/0x60
[ 71.204869]
[ 71.204869] but task is already holding lock:
[ 71.204870] (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}, at: [<ffffffff8151fba9>] lock_policy_rwsem_write+0x39/0x40
[ 71.204879]
[ 71.204879] which lock already depends on the new lock.
[ 71.204879]
[ 71.204881]
[ 71.204881] the existing dependency chain (in reverse order) is:
[ 71.204884]
-> #1 (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}:
[ 71.204889] [<ffffffff810ac130>] lock_acquire+0x90/0x140
[ 71.204894] [<ffffffff81660fe9>] down_write+0x49/0x6b
[ 71.204898] [<ffffffff8151fba9>] lock_policy_rwsem_write+0x39/0x40
[ 71.204901] [<ffffffff815213e0>] cpufreq_update_policy+0x40/0x130
[ 71.204904] [<ffffffff81522327>] cpufreq_stat_cpu_callback+0x27/0x70
[ 71.204907] [<ffffffff81668acd>] notifier_call_chain+0x4d/0x70
[ 71.204911] [<ffffffff8107730e>] __raw_notifier_call_chain+0xe/0x10
[ 71.204915] [<ffffffff8104b780>] __cpu_notify+0x20/0x40
[ 71.204918] [<ffffffff8104b916>] _cpu_up+0x116/0x170
[ 71.204921] [<ffffffff8164d540>] enable_nonboot_cpus+0x90/0xe0
[ 71.204926] [<ffffffff81098bd1>] suspend_devices_and_enter+0x301/0x420
[ 71.204930] [<ffffffff81098ec0>] pm_suspend+0x1d0/0x230
[ 71.205000] [<ffffffff81097b2a>] state_store+0x8a/0x100
[ 71.205005] [<ffffffff8131559f>] kobj_attr_store+0xf/0x30
[ 71.205009] [<ffffffff811fac36>] sysfs_write_file+0xe6/0x170
[ 71.205014] [<ffffffff81183c5e>] vfs_write+0xce/0x200
[ 71.205018] [<ffffffff81184165>] SyS_write+0x55/0xa0
[ 71.205022] [<ffffffff8166d3c2>] system_call_fastpath+0x16/0x1b
[ 71.205025]
-> #0 (cpu_hotplug.lock){+.+.+.}:
[ 71.205093] [<ffffffff810ab35c>] __lock_acquire+0x174c/0x1ed0
[ 71.205096] [<ffffffff810ac130>] lock_acquire+0x90/0x140
[ 71.205099] [<ffffffff8165f7b0>] mutex_lock_nested+0x70/0x380
[ 71.205102] [<ffffffff8104ba31>] get_online_cpus+0x41/0x60
[ 71.205217] [<ffffffff815247f8>] gov_queue_work+0x28/0xc0
[ 71.205221] [<ffffffff81524d97>] cpufreq_governor_dbs+0x507/0x710
[ 71.205224] [<ffffffff81522a17>] od_cpufreq_governor_dbs+0x17/0x20
[ 71.205226] [<ffffffff8151fec7>] __cpufreq_governor+0x87/0x1c0
[ 71.205230] [<ffffffff81520445>] __cpufreq_set_policy+0x1b5/0x1e0
[ 71.205232] [<ffffffff8152055a>] store_scaling_governor+0xea/0x1f0
[ 71.205235] [<ffffffff8151fcbd>] store+0x6d/0xc0
[ 71.205238] [<ffffffff811fac36>] sysfs_write_file+0xe6/0x170
[ 71.205305] [<ffffffff81183c5e>] vfs_write+0xce/0x200
[ 71.205308] [<ffffffff81184165>] SyS_write+0x55/0xa0
[ 71.205311] [<ffffffff8166d3c2>] system_call_fastpath+0x16/0x1b
[ 71.205313]
[ 71.205313] other info that might help us debug this:
[ 71.205313]
[ 71.205315] Possible unsafe locking scenario:
[ 71.205315]
[ 71.205317] CPU0 CPU1
[ 71.205318] ---- ----
[ 71.205383] lock(&per_cpu(cpu_policy_rwsem, cpu));
[ 71.205386] lock(cpu_hotplug.lock);
[ 71.205389] lock(&per_cpu(cpu_policy_rwsem, cpu));
[ 71.205392] lock(cpu_hotplug.lock);
[ 71.205509]
[ 71.205509] *** DEADLOCK ***
[ 71.205509]
[ 71.205511] 4 locks held by ondemand/2034:
[ 71.205512] #0: (sb_writers#6){.+.+.+}, at: [<ffffffff81183d63>] vfs_write+0x1d3/0x200
[ 71.205520] #1: (&buffer->mutex){+.+.+.}, at: [<ffffffff811fab94>] sysfs_write_file+0x44/0x170
[ 71.205640] #2: (s_active#178){.+.+.+}, at: [<ffffffff811fac1d>] sysfs_write_file+0xcd/0x170
[ 71.205648] #3: (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}, at: [<ffffffff8151fba9>] lock_policy_rwsem_write+0x39/0x40
[ 71.205655]
[ 71.205655] stack backtrace:
[ 71.205658] CPU: 1 PID: 2034 Comm: ondemand Tainted: G W 3.11.0-rc1cold-00008-g47188d3 #1
[ 71.205660] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Z68X-UD3H-B3, BIOS U1l 03/08/2013
[ 71.205773] ffffffff8218fd20 ffff8805fc5d38e8 ffffffff8165b74d 0000000000000000
[ 71.205778] ffffffff8211f130 ffff8805fc5d3938 ffffffff81657cef ffffffff8218fd20
[ 71.205783] ffff8805fc5d39c0 ffff8805fc5d3938 ffff880603726678 ffff880603725f10
[ 71.205900] Call Trace:
[ 71.205903] [<ffffffff8165b74d>] dump_stack+0x55/0x76
[ 71.205907] [<ffffffff81657cef>] print_circular_bug+0x1fb/0x20c
[ 71.205910] [<ffffffff810ab35c>] __lock_acquire+0x174c/0x1ed0
[ 71.205913] [<ffffffff810aa00c>] ? __lock_acquire+0x3fc/0x1ed0
[ 71.205916] [<ffffffff8104ba31>] ? get_online_cpus+0x41/0x60
[ 71.205919] [<ffffffff810ac130>] lock_acquire+0x90/0x140
[ 71.205921] [<ffffffff8104ba31>] ? get_online_cpus+0x41/0x60
[ 71.205924] [<ffffffff8165f7b0>] mutex_lock_nested+0x70/0x380
[ 71.205927] [<ffffffff8104ba31>] ? get_online_cpus+0x41/0x60
[ 71.205930] [<ffffffff810a89ee>] ? mark_held_locks+0x7e/0x150
[ 71.205933] [<ffffffff81660b9e>] ? mutex_unlock+0xe/0x10
[ 71.205936] [<ffffffff8165fb91>] ? __mutex_unlock_slowpath+0xd1/0x180
[ 71.205938] [<ffffffff8104ba31>] get_online_cpus+0x41/0x60
[ 71.205941] [<ffffffff815247f8>] gov_queue_work+0x28/0xc0
[ 71.205944] [<ffffffff81524d97>] cpufreq_governor_dbs+0x507/0x710
[ 71.205947] [<ffffffff81522a17>] od_cpufreq_governor_dbs+0x17/0x20
[ 71.205950] [<ffffffff8151fec7>] __cpufreq_governor+0x87/0x1c0
[ 71.206009] [<ffffffff81520445>] __cpufreq_set_policy+0x1b5/0x1e0
[ 71.206012] [<ffffffff8152055a>] store_scaling_governor+0xea/0x1f0
[ 71.206014] [<ffffffff815214d0>] ? cpufreq_update_policy+0x130/0x130
[ 71.206018] [<ffffffff8151fba9>] ? lock_policy_rwsem_write+0x39/0x40
[ 71.206021] [<ffffffff8151fcbd>] store+0x6d/0xc0
[ 71.206024] [<ffffffff811fac36>] sysfs_write_file+0xe6/0x170
[ 71.206026] [<ffffffff81183c5e>] vfs_write+0xce/0x200
[ 71.206029] [<ffffffff81184165>] SyS_write+0x55/0xa0
[ 71.206032] [<ffffffff8166d3c2>] system_call_fastpath+0x16/0x1b

(The other splat showed the lock acquisition order reversed, i.e. cpu_hotplug.lock
was held on CPU0 while CPU1 tried to take &per_cpu(...)). This one is tagged
"ondemand", the other one "pm-suspend" (reproducible with high probability).

On Monday 15 July 2013 18:49:39 Srivatsa S. Bhat wrote:
> I think I finally found out what exactly is going wrong! :-)
>
> I tried reproducing the problem on my machine, and found that the problem
> (warning about IPIs to offline CPUs) happens *only* while doing
> suspend/resume and not during halt/shutdown/reboot or regular CPU hotplug
> via sysfs files. That got me thinking and I finally figured out that commit
> a66b2e5 is again the culprit.
>
> So here is the solution:
>
> On 3.11-rc1, apply these patches in the order mentioned below, and check
> whether it fixes _all_ problems (both the warnings about IPI as well as the
> lockdep splat).
>
> 1. Patch given in: https://lkml.org/lkml/2013/7/11/661
> (Just apply patch 1, not the entire patchset).
>
> 2. Apply the patch shown below, on top of the above patch:
>
> ---------------------------------------------------------------------------
>
>
> From: Srivatsa S. Bhat <[email protected]>
> Subject: [PATCH] cpufreq: Revert commit 2f7021a to fix CPU hotplug
> regression

Please use '2f7021a8', without the '8' the commit hash is ambiguous.
(git describe says: v3.10-rc4-2-g2f7021a8)

With a66b2e5 and 2f7021a8 reverted on top of current master (47188d3), I ran
`pm-suspend` six times without any lockdep warnings.

Regards,
Peter

> commit 2f7021a (cpufreq: protect 'policy->cpus' from offlining during
> __gov_queue_work()) caused a regression in CPU hotplug, because it led
> to a deadlock between cpufreq governor worker thread and the CPU hotplug
> writer task.
>
> Lockdep splat corresponding to this deadlock is shown below:
>
> [ 60.277396] ======================================================
> [ 60.277400] [ INFO: possible circular locking dependency detected ]
> [ 60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
> [ 60.277411] -------------------------------------------------------
> [ 60.277417] bash/2225 is trying to acquire lock:
> [ 60.277422] ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
> [ 60.277444] but task is already holding lock:
> [ 60.277449] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
> [ 60.277465] which lock already depends on the new lock.
>
> [ 60.277472] the existing dependency chain (in reverse order) is:
> [ 60.277477] -> #2 (cpu_hotplug.lock){+.+.+.}:
> [ 60.277490] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
> [ 60.277503] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
> [ 60.277514] [<ffffffff81042cbc>] get_online_cpus+0x3c/0x60
> [ 60.277522] [<ffffffff814b842a>] gov_queue_work+0x2a/0xb0
> [ 60.277532] [<ffffffff814b7891>] cs_dbs_timer+0xc1/0xe0
> [ 60.277543] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
> [ 60.277552] [<ffffffff81063d31>] worker_thread+0x121/0x3a0
> [ 60.277560] [<ffffffff8106ae2b>] kthread+0xdb/0xe0
> [ 60.277569] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
> [ 60.277580] -> #1 (&j_cdbs->timer_mutex){+.+...}:
> [ 60.277592] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
> [ 60.277600] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
> [ 60.277608] [<ffffffff814b785d>] cs_dbs_timer+0x8d/0xe0
> [ 60.277616] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
> [ 60.277624] [<ffffffff81063d31>] worker_thread+0x121/0x3a0
> [ 60.277633] [<ffffffff8106ae2b>] kthread+0xdb/0xe0
> [ 60.277640] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
> [ 60.277649] -> #0 ((&(&j_cdbs->work)->work)){+.+...}:
> [ 60.277661] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
> [ 60.277669] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
> [ 60.277677] [<ffffffff810621ed>] flush_work+0x3d/0x280
> [ 60.277685] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
> [ 60.277693] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
> [ 60.277701] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
> [ 60.277709] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
> [ 60.277719] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
> [ 60.277728] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
> [ 60.277737] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
> [ 60.277747] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
> [ 60.277759] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
> [ 60.277768] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
> [ 60.277779] [<ffffffff815a0d46>] cpu_down+0x36/0x50
> [ 60.277788] [<ffffffff815a2748>] store_online+0x98/0xd0
> [ 60.277796] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
> [ 60.277806] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
> [ 60.277818] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
> [ 60.277826] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
> [ 60.277834] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
> [ 60.277842] other info that might help us debug this:
>
> [ 60.277848] Chain exists of:
> (&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock
>
> [ 60.277864] Possible unsafe locking scenario:
>
> [ 60.277869] CPU0 CPU1
> [ 60.277873] ---- ----
> [ 60.277877] lock(cpu_hotplug.lock);
> [ 60.277885] lock(&j_cdbs->timer_mutex);
> [ 60.277892] lock(cpu_hotplug.lock);
> [ 60.277900] lock((&(&j_cdbs->work)->work));
> [ 60.277907] *** DEADLOCK ***
>
> [ 60.277915] 6 locks held by bash/2225:
> [ 60.277919] #0: (sb_writers#6){.+.+.+}, at: [<ffffffff81168173>] vfs_write+0x1c3/0x1f0
> [ 60.277937] #1: (&buffer->mutex){+.+.+.}, at: [<ffffffff811d9e3c>] sysfs_write_file+0x3c/0x150
> [ 60.277954] #2: (s_active#61){.+.+.+}, at: [<ffffffff811d9ec3>] sysfs_write_file+0xc3/0x150
> [ 60.277972] #3: (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81024cf7>] cpu_hotplug_driver_lock+0x17/0x20
> [ 60.277990] #4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a0d32>] cpu_down+0x22/0x50
> [ 60.278007] #5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
> [ 60.278023] stack backtrace:
> [ 60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744
> [ 60.278037] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
> [ 60.278042] ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38
> [ 60.278055] ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2
> [ 60.278068] ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00
> [ 60.278081] Call Trace:
> [ 60.278091] [<ffffffff815b3d90>] dump_stack+0x19/0x1b
> [ 60.278101] [<ffffffff815b0a8d>] print_circular_bug+0x2b6/0x2c5
> [ 60.278111] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
> [ 60.278123] [<ffffffff81067e08>] ? __kernel_text_address+0x58/0x80
> [ 60.278134] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
> [ 60.278142] [<ffffffff810621b5>] ? flush_work+0x5/0x280
> [ 60.278151] [<ffffffff810621ed>] flush_work+0x3d/0x280
> [ 60.278159] [<ffffffff810621b5>] ? flush_work+0x5/0x280
> [ 60.278169] [<ffffffff810a9b14>] ? mark_held_locks+0x94/0x140
> [ 60.278178] [<ffffffff81062d77>] ? __cancel_work_timer+0x77/0x120
> [ 60.278188] [<ffffffff810a9cbd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
> [ 60.278196] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
> [ 60.278206] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
> [ 60.278214] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
> [ 60.278225] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
> [ 60.278234] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
> [ 60.278244] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
> [ 60.278255] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
> [ 60.278265] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
> [ 60.278275] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
> [ 60.278284] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
> [ 60.278292] [<ffffffff81024cf7>] ? cpu_hotplug_driver_lock+0x17/0x20
> [ 60.278302] [<ffffffff815a0d46>] cpu_down+0x36/0x50
> [ 60.278311] [<ffffffff815a2748>] store_online+0x98/0xd0
> [ 60.278320] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
> [ 60.278329] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
> [ 60.278337] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
> [ 60.278347] [<ffffffff81185950>] ? fget_light+0x320/0x4b0
> [ 60.278355] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
> [ 60.278364] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
> [ 60.280582] smpboot: CPU 1 is now offline
>
>
> The intent of this commit was to avoid warnings during CPU hotplug, which
> indicated that offline CPUs were getting IPIs from the cpufreq governor's
> work items. But the real root-cause of that problem was commit a66b2e5
> (cpufreq: Preserve sysfs files across suspend/resume) because it totally
> skipped all the cpufreq callbacks during CPU hotplug in the suspend/resume
> path, and hence it never actually shut down the cpufreq governor's worker
> threads during CPU offline in the suspend/resume path.
>
> Reflecting back, the reason why we never suspected that commit as the
> root-cause earlier, was that the original issue was reported with just the
> halt command and nobody had brought in suspend/resume to the equation.
>
> The reason for _that_ in turn, it turns out is that, earlier halt/shutdown
> was being done by disabling non-boot CPUs while tasks were frozen, just like
> suspend/resume.... but commit cf7df378a (reboot: rigrate shutdown/reboot
> to boot cpu) which came somewhere along that very same time changed that
> logic: shutdown/halt no longer takes CPUs offline.
> Thus, the test-cases for reproducing the bug were vastly different and thus
> we went totally off the trail.
>
> Overall, it was one hell of a confusion with so many commits affecting
> each other and also affecting the symptoms of the problems in subtle
> ways. Finally, now since the original problematic commit (a66b2e5) has been
> completely reverted, revert this intermediate fix too (2f7021a), to fix the
> CPU hotplug deadlock. Phew!
>
> Reported-by: Sergey Senozhatsky <[email protected]>
> Reported-by: Bartlomiej Zolnierkiewicz <[email protected]>
> Signed-off-by: Srivatsa S. Bhat <[email protected]>
> ---
>
> drivers/cpufreq/cpufreq_governor.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 4645876..7b839a8 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -25,7 +25,6 @@
> #include <linux/slab.h>
> #include <linux/types.h>
> #include <linux/workqueue.h>
> -#include <linux/cpu.h>
>
> #include "cpufreq_governor.h"
>
> @@ -137,10 +136,8 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> if (!all_cpus) {
> __gov_queue_work(smp_processor_id(), dbs_data, delay);
> } else {
> - get_online_cpus();
> for_each_cpu(i, policy->cpus)
> __gov_queue_work(i, dbs_data, delay);
> - put_online_cpus();
> }
> }
> EXPORT_SYMBOL_GPL(gov_queue_work);

2013-07-15 23:21:37

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On (07/15/13 18:49), Srivatsa S. Bhat wrote:
[..]
> So here is the solution:
>
> On 3.11-rc1, apply these patches in the order mentioned below, and check
> whether it fixes _all_ problems (both the warnings about IPI as well as the
> lockdep splat).
>
> 1. Patch given in: https://lkml.org/lkml/2013/7/11/661
> (Just apply patch 1, not the entire patchset).
>
> 2. Apply the patch shown below, on top of the above patch:
>
> ---------------------------------------------------------------------------
>

Hello Srivatsa,
Thanks, I'll test a bit later -- in the morning. (My laptop stopped resuming
from suspend, probably radeon dpm.)



Shouldn't we also kick the console lock?


kernel/printk.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/kernel/printk.c b/kernel/printk.c
index d37d45c..3e20233 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -1926,8 +1926,11 @@ static int __cpuinit console_cpu_notify(struct notifier_block *self,
{
switch (action) {
case CPU_ONLINE:
+ case CPU_ONLINE_FROZEN:
case CPU_DEAD:
+ case CPU_DEAD_FROZEN:
case CPU_DOWN_FAILED:
+ case CPU_DOWN_FAILED_FROZEN:
case CPU_UP_CANCELED:
console_lock();
console_unlock();



>
> From: Srivatsa S. Bhat <[email protected]>
> Subject: [PATCH] cpufreq: Revert commit 2f7021a to fix CPU hotplug regression
>
> commit 2f7021a (cpufreq: protect 'policy->cpus' from offlining during
> __gov_queue_work()) caused a regression in CPU hotplug, because it led
> to a deadlock between cpufreq governor worker thread and the CPU hotplug
> writer task.
>
> Lockdep splat corresponding to this deadlock is shown below:
>
> [ 60.277396] ======================================================
> [ 60.277400] [ INFO: possible circular locking dependency detected ]
> [ 60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
> [ 60.277411] -------------------------------------------------------
> [ 60.277417] bash/2225 is trying to acquire lock:
> [ 60.277422] ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
> [ 60.277444] but task is already holding lock:
> [ 60.277449] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
> [ 60.277465] which lock already depends on the new lock.
>
> [ 60.277472] the existing dependency chain (in reverse order) is:
> [ 60.277477] -> #2 (cpu_hotplug.lock){+.+.+.}:
> [ 60.277490] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
> [ 60.277503] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
> [ 60.277514] [<ffffffff81042cbc>] get_online_cpus+0x3c/0x60
> [ 60.277522] [<ffffffff814b842a>] gov_queue_work+0x2a/0xb0
> [ 60.277532] [<ffffffff814b7891>] cs_dbs_timer+0xc1/0xe0
> [ 60.277543] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
> [ 60.277552] [<ffffffff81063d31>] worker_thread+0x121/0x3a0
> [ 60.277560] [<ffffffff8106ae2b>] kthread+0xdb/0xe0
> [ 60.277569] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
> [ 60.277580] -> #1 (&j_cdbs->timer_mutex){+.+...}:
> [ 60.277592] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
> [ 60.277600] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
> [ 60.277608] [<ffffffff814b785d>] cs_dbs_timer+0x8d/0xe0
> [ 60.277616] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
> [ 60.277624] [<ffffffff81063d31>] worker_thread+0x121/0x3a0
> [ 60.277633] [<ffffffff8106ae2b>] kthread+0xdb/0xe0
> [ 60.277640] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
> [ 60.277649] -> #0 ((&(&j_cdbs->work)->work)){+.+...}:
> [ 60.277661] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
> [ 60.277669] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
> [ 60.277677] [<ffffffff810621ed>] flush_work+0x3d/0x280
> [ 60.277685] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
> [ 60.277693] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
> [ 60.277701] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
> [ 60.277709] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
> [ 60.277719] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
> [ 60.277728] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
> [ 60.277737] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
> [ 60.277747] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
> [ 60.277759] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
> [ 60.277768] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
> [ 60.277779] [<ffffffff815a0d46>] cpu_down+0x36/0x50
> [ 60.277788] [<ffffffff815a2748>] store_online+0x98/0xd0
> [ 60.277796] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
> [ 60.277806] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
> [ 60.277818] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
> [ 60.277826] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
> [ 60.277834] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
> [ 60.277842] other info that might help us debug this:
>
> [ 60.277848] Chain exists of:
> (&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock
>
> [ 60.277864] Possible unsafe locking scenario:
>
> [ 60.277869] CPU0 CPU1
> [ 60.277873] ---- ----
> [ 60.277877] lock(cpu_hotplug.lock);
> [ 60.277885] lock(&j_cdbs->timer_mutex);
> [ 60.277892] lock(cpu_hotplug.lock);
> [ 60.277900] lock((&(&j_cdbs->work)->work));
> [ 60.277907] *** DEADLOCK ***
>
> [ 60.277915] 6 locks held by bash/2225:
> [ 60.277919] #0: (sb_writers#6){.+.+.+}, at: [<ffffffff81168173>] vfs_write+0x1c3/0x1f0
> [ 60.277937] #1: (&buffer->mutex){+.+.+.}, at: [<ffffffff811d9e3c>] sysfs_write_file+0x3c/0x150
> [ 60.277954] #2: (s_active#61){.+.+.+}, at: [<ffffffff811d9ec3>] sysfs_write_file+0xc3/0x150
> [ 60.277972] #3: (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81024cf7>] cpu_hotplug_driver_lock+0x17/0x20
> [ 60.277990] #4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a0d32>] cpu_down+0x22/0x50
> [ 60.278007] #5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
> [ 60.278023] stack backtrace:
> [ 60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744
> [ 60.278037] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
> [ 60.278042] ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38
> [ 60.278055] ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2
> [ 60.278068] ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00
> [ 60.278081] Call Trace:
> [ 60.278091] [<ffffffff815b3d90>] dump_stack+0x19/0x1b
> [ 60.278101] [<ffffffff815b0a8d>] print_circular_bug+0x2b6/0x2c5
> [ 60.278111] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
> [ 60.278123] [<ffffffff81067e08>] ? __kernel_text_address+0x58/0x80
> [ 60.278134] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
> [ 60.278142] [<ffffffff810621b5>] ? flush_work+0x5/0x280
> [ 60.278151] [<ffffffff810621ed>] flush_work+0x3d/0x280
> [ 60.278159] [<ffffffff810621b5>] ? flush_work+0x5/0x280
> [ 60.278169] [<ffffffff810a9b14>] ? mark_held_locks+0x94/0x140
> [ 60.278178] [<ffffffff81062d77>] ? __cancel_work_timer+0x77/0x120
> [ 60.278188] [<ffffffff810a9cbd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
> [ 60.278196] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
> [ 60.278206] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
> [ 60.278214] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
> [ 60.278225] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
> [ 60.278234] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
> [ 60.278244] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
> [ 60.278255] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
> [ 60.278265] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
> [ 60.278275] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
> [ 60.278284] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
> [ 60.278292] [<ffffffff81024cf7>] ? cpu_hotplug_driver_lock+0x17/0x20
> [ 60.278302] [<ffffffff815a0d46>] cpu_down+0x36/0x50
> [ 60.278311] [<ffffffff815a2748>] store_online+0x98/0xd0
> [ 60.278320] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
> [ 60.278329] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
> [ 60.278337] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
> [ 60.278347] [<ffffffff81185950>] ? fget_light+0x320/0x4b0
> [ 60.278355] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
> [ 60.278364] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
> [ 60.280582] smpboot: CPU 1 is now offline
>
>
> The intent of this commit was to avoid warnings during CPU hotplug, which
> indicated that offline CPUs were getting IPIs from the cpufreq governor's
> work items. But the real root-cause of that problem was commit a66b2e5
> (cpufreq: Preserve sysfs files across suspend/resume) because it totally
> skipped all the cpufreq callbacks during CPU hotplug in the suspend/resume
> path, and hence it never actually shut down the cpufreq governor's worker
> threads during CPU offline in the suspend/resume path.
>
> Reflecting back, the reason why we never suspected that commit as the
> root-cause earlier, was that the original issue was reported with just the
> halt command and nobody had brought in suspend/resume to the equation.
>
> The reason for _that_ in turn, it turns out is that, earlier halt/shutdown
> was being done by disabling non-boot CPUs while tasks were frozen, just like
> suspend/resume.... but commit cf7df378a (reboot: rigrate shutdown/reboot to
> boot cpu) which came somewhere along that very same time changed that logic:
> shutdown/halt no longer takes CPUs offline.
> Thus, the test-cases for reproducing the bug were vastly different and thus
> we went totally off the trail.
>
> Overall, it was one hell of a confusion with so many commits affecting
> each other and also affecting the symptoms of the problems in subtle
> ways. Finally, now since the original problematic commit (a66b2e5) has been
> completely reverted, revert this intermediate fix too (2f7021a), to fix the
> CPU hotplug deadlock. Phew!
>
> Reported-by: Sergey Senozhatsky <[email protected]>
> Reported-by: Bartlomiej Zolnierkiewicz <[email protected]>
> Signed-off-by: Srivatsa S. Bhat <[email protected]>
> ---
>
> drivers/cpufreq/cpufreq_governor.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 4645876..7b839a8 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -25,7 +25,6 @@
> #include <linux/slab.h>
> #include <linux/types.h>
> #include <linux/workqueue.h>
> -#include <linux/cpu.h>
>
> #include "cpufreq_governor.h"
>
> @@ -137,10 +136,8 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> if (!all_cpus) {
> __gov_queue_work(smp_processor_id(), dbs_data, delay);
> } else {
> - get_online_cpus();
> for_each_cpu(i, policy->cpus)
> __gov_queue_work(i, dbs_data, delay);
> - put_online_cpus();
> }
> }
> EXPORT_SYMBOL_GPL(gov_queue_work);
>

2013-07-16 02:19:29

by Michael wang

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On 07/15/2013 09:19 PM, Srivatsa S. Bhat wrote:
> On 07/15/2013 01:59 PM, Sergey Senozhatsky wrote:
>> On (07/15/13 15:52), Michael Wang wrote:
>>>>
>>>> And may be we could try below patch to get more info, I've moved the timing
>>>> of restore stop flag from 'after STOP' to 'before START', I suppose that
>>>> could create a window to prevent the work re-queue, it could at least provide
>>>> us more info...
>>>>
>>>> I think I may need to setup a environment for debug now, what's the steps to
>>>> produce this WARN?
>>>
>>> I have done some test, although I failed to reproduce this WARN, but I
>>> found that the work is still running and re-queue itself after STOP,
>>> even with my prev suggestion...
>>>
>>> However, enlarge the stop window as my suggestion below, the work do
>>> stopped...I suppose it will also stop the first WARN too.
>>>
>>> Now we need to figure out the reason...
>>>
>>
>> Hello,
>>
>> WARN is triggered during laptop suspend/hibernate phase.
>> I'll test your patch soon.
>>
>
> Hi,
>
> I think I finally found out what exactly is going wrong! :-)
>
> I tried reproducing the problem on my machine, and found that the problem
> (warning about IPIs to offline CPUs) happens *only* while doing suspend/resume
> and not during halt/shutdown/reboot or regular CPU hotplug via sysfs files.
> That got me thinking and I finally figured out that commit a66b2e5 is again
> the culprit.
>
> So here is the solution:

Nice to know the problem got solved ;-) (although there is still
something unclear to me... anyway).

Regards,
Michael Wang


>
> On 3.11-rc1, apply these patches in the order mentioned below, and check
> whether it fixes _all_ problems (both the warnings about IPI as well as the
> lockdep splat).
>
> 1. Patch given in: https://lkml.org/lkml/2013/7/11/661
> (Just apply patch 1, not the entire patchset).
>
> 2. Apply the patch shown below, on top of the above patch:
>
> ---------------------------------------------------------------------------
>
>
> From: Srivatsa S. Bhat <[email protected]>
> Subject: [PATCH] cpufreq: Revert commit 2f7021a to fix CPU hotplug regression
>
> commit 2f7021a (cpufreq: protect 'policy->cpus' from offlining during
> __gov_queue_work()) caused a regression in CPU hotplug, because it led
> to a deadlock between cpufreq governor worker thread and the CPU hotplug
> writer task.
>
> Lockdep splat corresponding to this deadlock is shown below:
>
> [ 60.277396] ======================================================
> [ 60.277400] [ INFO: possible circular locking dependency detected ]
> [ 60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
> [ 60.277411] -------------------------------------------------------
> [ 60.277417] bash/2225 is trying to acquire lock:
> [ 60.277422] ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
> [ 60.277444] but task is already holding lock:
> [ 60.277449] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
> [ 60.277465] which lock already depends on the new lock.
>
> [ 60.277472] the existing dependency chain (in reverse order) is:
> [ 60.277477] -> #2 (cpu_hotplug.lock){+.+.+.}:
> [ 60.277490] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
> [ 60.277503] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
> [ 60.277514] [<ffffffff81042cbc>] get_online_cpus+0x3c/0x60
> [ 60.277522] [<ffffffff814b842a>] gov_queue_work+0x2a/0xb0
> [ 60.277532] [<ffffffff814b7891>] cs_dbs_timer+0xc1/0xe0
> [ 60.277543] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
> [ 60.277552] [<ffffffff81063d31>] worker_thread+0x121/0x3a0
> [ 60.277560] [<ffffffff8106ae2b>] kthread+0xdb/0xe0
> [ 60.277569] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
> [ 60.277580] -> #1 (&j_cdbs->timer_mutex){+.+...}:
> [ 60.277592] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
> [ 60.277600] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
> [ 60.277608] [<ffffffff814b785d>] cs_dbs_timer+0x8d/0xe0
> [ 60.277616] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
> [ 60.277624] [<ffffffff81063d31>] worker_thread+0x121/0x3a0
> [ 60.277633] [<ffffffff8106ae2b>] kthread+0xdb/0xe0
> [ 60.277640] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
> [ 60.277649] -> #0 ((&(&j_cdbs->work)->work)){+.+...}:
> [ 60.277661] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
> [ 60.277669] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
> [ 60.277677] [<ffffffff810621ed>] flush_work+0x3d/0x280
> [ 60.277685] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
> [ 60.277693] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
> [ 60.277701] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
> [ 60.277709] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
> [ 60.277719] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
> [ 60.277728] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
> [ 60.277737] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
> [ 60.277747] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
> [ 60.277759] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
> [ 60.277768] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
> [ 60.277779] [<ffffffff815a0d46>] cpu_down+0x36/0x50
> [ 60.277788] [<ffffffff815a2748>] store_online+0x98/0xd0
> [ 60.277796] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
> [ 60.277806] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
> [ 60.277818] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
> [ 60.277826] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
> [ 60.277834] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
> [ 60.277842] other info that might help us debug this:
>
> [ 60.277848] Chain exists of:
> (&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock
>
> [ 60.277864] Possible unsafe locking scenario:
>
> [ 60.277869] CPU0 CPU1
> [ 60.277873] ---- ----
> [ 60.277877] lock(cpu_hotplug.lock);
> [ 60.277885] lock(&j_cdbs->timer_mutex);
> [ 60.277892] lock(cpu_hotplug.lock);
> [ 60.277900] lock((&(&j_cdbs->work)->work));
> [ 60.277907] *** DEADLOCK ***
>
> [ 60.277915] 6 locks held by bash/2225:
> [ 60.277919] #0: (sb_writers#6){.+.+.+}, at: [<ffffffff81168173>] vfs_write+0x1c3/0x1f0
> [ 60.277937] #1: (&buffer->mutex){+.+.+.}, at: [<ffffffff811d9e3c>] sysfs_write_file+0x3c/0x150
> [ 60.277954] #2: (s_active#61){.+.+.+}, at: [<ffffffff811d9ec3>] sysfs_write_file+0xc3/0x150
> [ 60.277972] #3: (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81024cf7>] cpu_hotplug_driver_lock+0x17/0x20
> [ 60.277990] #4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a0d32>] cpu_down+0x22/0x50
> [ 60.278007] #5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
> [ 60.278023] stack backtrace:
> [ 60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744
> [ 60.278037] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
> [ 60.278042] ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38
> [ 60.278055] ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2
> [ 60.278068] ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00
> [ 60.278081] Call Trace:
> [ 60.278091] [<ffffffff815b3d90>] dump_stack+0x19/0x1b
> [ 60.278101] [<ffffffff815b0a8d>] print_circular_bug+0x2b6/0x2c5
> [ 60.278111] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
> [ 60.278123] [<ffffffff81067e08>] ? __kernel_text_address+0x58/0x80
> [ 60.278134] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
> [ 60.278142] [<ffffffff810621b5>] ? flush_work+0x5/0x280
> [ 60.278151] [<ffffffff810621ed>] flush_work+0x3d/0x280
> [ 60.278159] [<ffffffff810621b5>] ? flush_work+0x5/0x280
> [ 60.278169] [<ffffffff810a9b14>] ? mark_held_locks+0x94/0x140
> [ 60.278178] [<ffffffff81062d77>] ? __cancel_work_timer+0x77/0x120
> [ 60.278188] [<ffffffff810a9cbd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
> [ 60.278196] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
> [ 60.278206] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
> [ 60.278214] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
> [ 60.278225] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
> [ 60.278234] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
> [ 60.278244] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
> [ 60.278255] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
> [ 60.278265] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
> [ 60.278275] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
> [ 60.278284] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
> [ 60.278292] [<ffffffff81024cf7>] ? cpu_hotplug_driver_lock+0x17/0x20
> [ 60.278302] [<ffffffff815a0d46>] cpu_down+0x36/0x50
> [ 60.278311] [<ffffffff815a2748>] store_online+0x98/0xd0
> [ 60.278320] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
> [ 60.278329] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
> [ 60.278337] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
> [ 60.278347] [<ffffffff81185950>] ? fget_light+0x320/0x4b0
> [ 60.278355] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
> [ 60.278364] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
> [ 60.280582] smpboot: CPU 1 is now offline
>
>
> The intent of this commit was to avoid warnings during CPU hotplug, which
> indicated that offline CPUs were getting IPIs from the cpufreq governor's
> work items. But the real root-cause of that problem was commit a66b2e5
> (cpufreq: Preserve sysfs files across suspend/resume) because it totally
> skipped all the cpufreq callbacks during CPU hotplug in the suspend/resume
> path, and hence it never actually shut down the cpufreq governor's worker
> threads during CPU offline in the suspend/resume path.
>
> Looking back, the reason we never suspected that commit as the root cause
> earlier was that the original issue was reported with just the halt command,
> and nobody had brought suspend/resume into the equation.
>
> The reason for _that_, in turn, is that halt/shutdown used to be performed by
> disabling non-boot CPUs while tasks were frozen, just like suspend/resume...
> but commit cf7df378a (reboot: rigrate shutdown/reboot to boot cpu), which
> went in around the same time, changed that logic: shutdown/halt no longer
> takes CPUs offline. Thus the test-cases for reproducing the bug were vastly
> different, and we went totally off the trail.
>
> Overall, it was one hell of a confusion, with so many commits affecting each
> other and also affecting the symptoms of the problem in subtle ways. Now that
> the original problematic commit (a66b2e5) has been completely reverted,
> revert this intermediate fix too (2f7021a) to fix the CPU hotplug deadlock.
> Phew!
>
> Reported-by: Sergey Senozhatsky <[email protected]>
> Reported-by: Bartlomiej Zolnierkiewicz <[email protected]>
> Signed-off-by: Srivatsa S. Bhat <[email protected]>
> ---
>
> drivers/cpufreq/cpufreq_governor.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 4645876..7b839a8 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -25,7 +25,6 @@
> #include <linux/slab.h>
> #include <linux/types.h>
> #include <linux/workqueue.h>
> -#include <linux/cpu.h>
>
> #include "cpufreq_governor.h"
>
> @@ -137,10 +136,8 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> if (!all_cpus) {
> __gov_queue_work(smp_processor_id(), dbs_data, delay);
> } else {
> - get_online_cpus();
> for_each_cpu(i, policy->cpus)
> __gov_queue_work(i, dbs_data, delay);
> - put_online_cpus();
> }
> }
> EXPORT_SYMBOL_GPL(gov_queue_work);
>

2013-07-16 08:33:26

by Srivatsa S. Bhat

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

Hi Peter,

On 07/16/2013 02:19 AM, Peter Wu wrote:
> Hi,
>
> I think I also encountered this similar issue after resume (and possibly a
> real deadlock yesterday before/during suspend?). One message:
>
> [ 71.204848] ======================================================
> [ 71.204850] [ INFO: possible circular locking dependency detected ]
> [ 71.204852] 3.11.0-rc1cold-00008-g47188d3 #1 Tainted: G W
> [ 71.204854] -------------------------------------------------------
> [ 71.204855] ondemand/2034 is trying to acquire lock:
> [ 71.204857] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8104ba31>] get_online_cpus+0x41/0x60
> [ 71.204869]
> [ 71.204869] but task is already holding lock:
> [ 71.204870] (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}, at: [<ffffffff8151fba9>] lock_policy_rwsem_write+0x39/0x40
> [ 71.204879]
> [ 71.204879] which lock already depends on the new lock.
> [ 71.204879]
> [ 71.204881]
> [ 71.204881] the existing dependency chain (in reverse order) is:
> [ 71.204884]
> -> #1 (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}:
> [ 71.204889] [<ffffffff810ac130>] lock_acquire+0x90/0x140
> [ 71.204894] [<ffffffff81660fe9>] down_write+0x49/0x6b
> [ 71.204898] [<ffffffff8151fba9>] lock_policy_rwsem_write+0x39/0x40
> [ 71.204901] [<ffffffff815213e0>] cpufreq_update_policy+0x40/0x130
> [ 71.204904] [<ffffffff81522327>] cpufreq_stat_cpu_callback+0x27/0x70
> [ 71.204907] [<ffffffff81668acd>] notifier_call_chain+0x4d/0x70
> [ 71.204911] [<ffffffff8107730e>] __raw_notifier_call_chain+0xe/0x10
> [ 71.204915] [<ffffffff8104b780>] __cpu_notify+0x20/0x40
> [ 71.204918] [<ffffffff8104b916>] _cpu_up+0x116/0x170
> [ 71.204921] [<ffffffff8164d540>] enable_nonboot_cpus+0x90/0xe0
> [ 71.204926] [<ffffffff81098bd1>] suspend_devices_and_enter+0x301/0x420
> [ 71.204930] [<ffffffff81098ec0>] pm_suspend+0x1d0/0x230
> [ 71.205000] [<ffffffff81097b2a>] state_store+0x8a/0x100
> [ 71.205005] [<ffffffff8131559f>] kobj_attr_store+0xf/0x30
> [ 71.205009] [<ffffffff811fac36>] sysfs_write_file+0xe6/0x170
> [ 71.205014] [<ffffffff81183c5e>] vfs_write+0xce/0x200
> [ 71.205018] [<ffffffff81184165>] SyS_write+0x55/0xa0
> [ 71.205022] [<ffffffff8166d3c2>] system_call_fastpath+0x16/0x1b
> [ 71.205025]
> -> #0 (cpu_hotplug.lock){+.+.+.}:
> [ 71.205093] [<ffffffff810ab35c>] __lock_acquire+0x174c/0x1ed0
> [ 71.205096] [<ffffffff810ac130>] lock_acquire+0x90/0x140
> [ 71.205099] [<ffffffff8165f7b0>] mutex_lock_nested+0x70/0x380
> [ 71.205102] [<ffffffff8104ba31>] get_online_cpus+0x41/0x60
> [ 71.205217] [<ffffffff815247f8>] gov_queue_work+0x28/0xc0
> [ 71.205221] [<ffffffff81524d97>] cpufreq_governor_dbs+0x507/0x710
> [ 71.205224] [<ffffffff81522a17>] od_cpufreq_governor_dbs+0x17/0x20
> [ 71.205226] [<ffffffff8151fec7>] __cpufreq_governor+0x87/0x1c0
> [ 71.205230] [<ffffffff81520445>] __cpufreq_set_policy+0x1b5/0x1e0
> [ 71.205232] [<ffffffff8152055a>] store_scaling_governor+0xea/0x1f0
> [ 71.205235] [<ffffffff8151fcbd>] store+0x6d/0xc0
> [ 71.205238] [<ffffffff811fac36>] sysfs_write_file+0xe6/0x170
> [ 71.205305] [<ffffffff81183c5e>] vfs_write+0xce/0x200
> [ 71.205308] [<ffffffff81184165>] SyS_write+0x55/0xa0
> [ 71.205311] [<ffffffff8166d3c2>] system_call_fastpath+0x16/0x1b
> [ 71.205313]
> [ 71.205313] other info that might help us debug this:
> [ 71.205313]
> [ 71.205315] Possible unsafe locking scenario:
> [ 71.205315]
> [ 71.205317] CPU0 CPU1
> [ 71.205318] ---- ----
> [ 71.205383] lock(&per_cpu(cpu_policy_rwsem, cpu));
> [ 71.205386] lock(cpu_hotplug.lock);
> [ 71.205389] lock(&per_cpu(cpu_policy_rwsem, cpu));
> [ 71.205392] lock(cpu_hotplug.lock);
> [ 71.205509]
> [ 71.205509] *** DEADLOCK ***
> [ 71.205509]
> [ 71.205511] 4 locks held by ondemand/2034:
> [ 71.205512] #0: (sb_writers#6){.+.+.+}, at: [<ffffffff81183d63>] vfs_write+0x1d3/0x200
> [ 71.205520] #1: (&buffer->mutex){+.+.+.}, at: [<ffffffff811fab94>] sysfs_write_file+0x44/0x170
> [ 71.205640] #2: (s_active#178){.+.+.+}, at: [<ffffffff811fac1d>] sysfs_write_file+0xcd/0x170
> [ 71.205648] #3: (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}, at: [<ffffffff8151fba9>] lock_policy_rwsem_write+0x39/0x40
> [ 71.205655]
> [ 71.205655] stack backtrace:
> [ 71.205658] CPU: 1 PID: 2034 Comm: ondemand Tainted: G W 3.11.0-rc1cold-00008-g47188d3 #1
> [ 71.205660] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Z68X-UD3H-B3, BIOS U1l 03/08/2013
> [ 71.205773] ffffffff8218fd20 ffff8805fc5d38e8 ffffffff8165b74d 0000000000000000
> [ 71.205778] ffffffff8211f130 ffff8805fc5d3938 ffffffff81657cef ffffffff8218fd20
> [ 71.205783] ffff8805fc5d39c0 ffff8805fc5d3938 ffff880603726678 ffff880603725f10
> [ 71.205900] Call Trace:
> [ 71.205903] [<ffffffff8165b74d>] dump_stack+0x55/0x76
> [ 71.205907] [<ffffffff81657cef>] print_circular_bug+0x1fb/0x20c
> [ 71.205910] [<ffffffff810ab35c>] __lock_acquire+0x174c/0x1ed0
> [ 71.205913] [<ffffffff810aa00c>] ? __lock_acquire+0x3fc/0x1ed0
> [ 71.205916] [<ffffffff8104ba31>] ? get_online_cpus+0x41/0x60
> [ 71.205919] [<ffffffff810ac130>] lock_acquire+0x90/0x140
> [ 71.205921] [<ffffffff8104ba31>] ? get_online_cpus+0x41/0x60
> [ 71.205924] [<ffffffff8165f7b0>] mutex_lock_nested+0x70/0x380
> [ 71.205927] [<ffffffff8104ba31>] ? get_online_cpus+0x41/0x60
> [ 71.205930] [<ffffffff810a89ee>] ? mark_held_locks+0x7e/0x150
> [ 71.205933] [<ffffffff81660b9e>] ? mutex_unlock+0xe/0x10
> [ 71.205936] [<ffffffff8165fb91>] ? __mutex_unlock_slowpath+0xd1/0x180
> [ 71.205938] [<ffffffff8104ba31>] get_online_cpus+0x41/0x60
> [ 71.205941] [<ffffffff815247f8>] gov_queue_work+0x28/0xc0
> [ 71.205944] [<ffffffff81524d97>] cpufreq_governor_dbs+0x507/0x710
> [ 71.205947] [<ffffffff81522a17>] od_cpufreq_governor_dbs+0x17/0x20
> [ 71.205950] [<ffffffff8151fec7>] __cpufreq_governor+0x87/0x1c0
> [ 71.206009] [<ffffffff81520445>] __cpufreq_set_policy+0x1b5/0x1e0
> [ 71.206012] [<ffffffff8152055a>] store_scaling_governor+0xea/0x1f0
> [ 71.206014] [<ffffffff815214d0>] ? cpufreq_update_policy+0x130/0x130
> [ 71.206018] [<ffffffff8151fba9>] ? lock_policy_rwsem_write+0x39/0x40
> [ 71.206021] [<ffffffff8151fcbd>] store+0x6d/0xc0
> [ 71.206024] [<ffffffff811fac36>] sysfs_write_file+0xe6/0x170
> [ 71.206026] [<ffffffff81183c5e>] vfs_write+0xce/0x200
> [ 71.206029] [<ffffffff81184165>] SyS_write+0x55/0xa0
> [ 71.206032] [<ffffffff8166d3c2>] system_call_fastpath+0x16/0x1b
>
> (the other splat had the lock acquisition order reversed, i.e. cpu_hotplug.lock
> was held on CPU0 while CPU1 tried to lock &per_cpu(...)). This one is tagged
> "ondemand", the other one "pm-suspend" (reproducible with high probability).
>

Hmm, this looks like a different problem, where a store (echo from sysfs) to
the scaling_governor file races with suspend/resume. Can you please open a
new thread and post the bug report? (Otherwise this thread will get even more
confusing if we start discussing separate problems all in one single email
thread.)

> On Monday 15 July 2013 18:49:39 Srivatsa S. Bhat wrote:
>> I think I finally found out what exactly is going wrong! :-)
>>
>> I tried reproducing the problem on my machine, and found that the problem
>> (warning about IPIs to offline CPUs) happens *only* while doing
>> suspend/resume and not during halt/shutdown/reboot or regular CPU hotplug
>> via sysfs files. That got me thinking and I finally figured out that commit
>> a66b2e5 is again the culprit.
>>
>> So here is the solution:
>>
>> On 3.11-rc1, apply these patches in the order mentioned below, and check
>> whether it fixes _all_ problems (both the warnings about IPI as well as the
>> lockdep splat).
>>
>> 1. Patch given in: https://lkml.org/lkml/2013/7/11/661
>> (Just apply patch 1, not the entire patchset).
>>
>> 2. Apply the patch shown below, on top of the above patch:
>>
>> ---------------------------------------------------------------------------
>>
>>
>> From: Srivatsa S. Bhat <[email protected]>
>> Subject: [PATCH] cpufreq: Revert commit 2f7021a to fix CPU hotplug regression
>
> Please use '2f7021a8'; without the '8' the commit hash is ambiguous.
> (git describe says: v3.10-rc4-2-g2f7021a8)
>

Yeah, even I noticed it after I sent out the patch when I was trying to look
up that commit for some other reason. Thanks for pointing that out!

> I ran six times `pm-suspend` without any lockdep warnings. I reverted a66b2e5
> and 2f7021a8 on top of current master (47188d3).
>

Cool! Thanks for testing!

Regards,
Srivatsa S. Bhat

>
>> commit 2f7021a (cpufreq: protect 'policy->cpus' from offlining during
>> __gov_queue_work()) caused a regression in CPU hotplug, because it lead
>> to a deadlock between cpufreq governor worker thread and the CPU hotplug
>> writer task.
>>
>> Lockdep splat corresponding to this deadlock is shown below:
>>
>> [ 60.277396] ======================================================
>> [ 60.277400] [ INFO: possible circular locking dependency detected ]
>> [ 60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
>> [ 60.277411] -------------------------------------------------------
>> [ 60.277417] bash/2225 is trying to acquire lock:
>> [ 60.277422] ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
>> [ 60.277444] but task is already holding lock:
>> [ 60.277449] (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
>> [ 60.277465] which lock already depends on the new lock.
>>
>> [ 60.277472] the existing dependency chain (in reverse order) is:
>> [ 60.277477] -> #2 (cpu_hotplug.lock){+.+.+.}:
>> [ 60.277490] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
>> [ 60.277503] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
>> [ 60.277514] [<ffffffff81042cbc>] get_online_cpus+0x3c/0x60
>> [ 60.277522] [<ffffffff814b842a>] gov_queue_work+0x2a/0xb0
>> [ 60.277532] [<ffffffff814b7891>] cs_dbs_timer+0xc1/0xe0
>> [ 60.277543] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
>> [ 60.277552] [<ffffffff81063d31>] worker_thread+0x121/0x3a0
>> [ 60.277560] [<ffffffff8106ae2b>] kthread+0xdb/0xe0
>> [ 60.277569] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
>> [ 60.277580] -> #1 (&j_cdbs->timer_mutex){+.+...}:
>> [ 60.277592] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
>> [ 60.277600] [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
>> [ 60.277608] [<ffffffff814b785d>] cs_dbs_timer+0x8d/0xe0
>> [ 60.277616] [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
>> [ 60.277624] [<ffffffff81063d31>] worker_thread+0x121/0x3a0
>> [ 60.277633] [<ffffffff8106ae2b>] kthread+0xdb/0xe0
>> [ 60.277640] [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
>> [ 60.277649] -> #0 ((&(&j_cdbs->work)->work)){+.+...}:
>> [ 60.277661] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
>> [ 60.277669] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
>> [ 60.277677] [<ffffffff810621ed>] flush_work+0x3d/0x280
>> [ 60.277685] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
>> [ 60.277693] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
>> [ 60.277701] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
>> [ 60.277709] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
>> [ 60.277719] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
>> [ 60.277728] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
>> [ 60.277737] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
>> [ 60.277747] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
>> [ 60.277759] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
>> [ 60.277768] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
>> [ 60.277779] [<ffffffff815a0d46>] cpu_down+0x36/0x50
>> [ 60.277788] [<ffffffff815a2748>] store_online+0x98/0xd0
>> [ 60.277796] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
>> [ 60.277806] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
>> [ 60.277818] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
>> [ 60.277826] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
>> [ 60.277834] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
>> [ 60.277842] other info that might help us debug this:
>>
>> [ 60.277848] Chain exists of:
>> (&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock
>>
>> [ 60.277864] Possible unsafe locking scenario:
>>
>> [ 60.277869] CPU0 CPU1
>> [ 60.277873] ---- ----
>> [ 60.277877] lock(cpu_hotplug.lock);
>> [ 60.277885] lock(&j_cdbs->timer_mutex);
>> [ 60.277892] lock(cpu_hotplug.lock);
>> [ 60.277900] lock((&(&j_cdbs->work)->work));
>> [ 60.277907] *** DEADLOCK ***
>>
>> [ 60.277915] 6 locks held by bash/2225:
>> [ 60.277919] #0: (sb_writers#6){.+.+.+}, at: [<ffffffff81168173>] vfs_write+0x1c3/0x1f0
>> [ 60.277937] #1: (&buffer->mutex){+.+.+.}, at: [<ffffffff811d9e3c>] sysfs_write_file+0x3c/0x150
>> [ 60.277954] #2: (s_active#61){.+.+.+}, at: [<ffffffff811d9ec3>] sysfs_write_file+0xc3/0x150
>> [ 60.277972] #3: (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81024cf7>] cpu_hotplug_driver_lock+0x17/0x20
>> [ 60.277990] #4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a0d32>] cpu_down+0x22/0x50
>> [ 60.278007] #5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
>> [ 60.278023] stack backtrace:
>> [ 60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744
>> [ 60.278037] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
>> [ 60.278042] ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38
>> [ 60.278055] ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2
>> [ 60.278068] ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00
>> [ 60.278081] Call Trace:
>> [ 60.278091] [<ffffffff815b3d90>] dump_stack+0x19/0x1b
>> [ 60.278101] [<ffffffff815b0a8d>] print_circular_bug+0x2b6/0x2c5
>> [ 60.278111] [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
>> [ 60.278123] [<ffffffff81067e08>] ? __kernel_text_address+0x58/0x80
>> [ 60.278134] [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
>> [ 60.278142] [<ffffffff810621b5>] ? flush_work+0x5/0x280
>> [ 60.278151] [<ffffffff810621ed>] flush_work+0x3d/0x280
>> [ 60.278159] [<ffffffff810621b5>] ? flush_work+0x5/0x280
>> [ 60.278169] [<ffffffff810a9b14>] ? mark_held_locks+0x94/0x140
>> [ 60.278178] [<ffffffff81062d77>] ? __cancel_work_timer+0x77/0x120
>> [ 60.278188] [<ffffffff810a9cbd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
>> [ 60.278196] [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
>> [ 60.278206] [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
>> [ 60.278214] [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
>> [ 60.278225] [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
>> [ 60.278234] [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
>> [ 60.278244] [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
>> [ 60.278255] [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
>> [ 60.278265] [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
>> [ 60.278275] [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
>> [ 60.278284] [<ffffffff815a0a68>] _cpu_down+0x88/0x330
>> [ 60.278292] [<ffffffff81024cf7>] ? cpu_hotplug_driver_lock+0x17/0x20
>> [ 60.278302] [<ffffffff815a0d46>] cpu_down+0x36/0x50
>> [ 60.278311] [<ffffffff815a2748>] store_online+0x98/0xd0
>> [ 60.278320] [<ffffffff81452a28>] dev_attr_store+0x18/0x30
>> [ 60.278329] [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
>> [ 60.278337] [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
>> [ 60.278347] [<ffffffff81185950>] ? fget_light+0x320/0x4b0
>> [ 60.278355] [<ffffffff811686fc>] SyS_write+0x4c/0xa0
>> [ 60.278364] [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
>> [ 60.280582] smpboot: CPU 1 is now offline
>>
>>
>> The intent of this commit was to avoid warnings during CPU hotplug, which
>> indicated that offline CPUs were getting IPIs from the cpufreq governor's
>> work items. But the real root-cause of that problem was commit a66b2e5
>> (cpufreq: Preserve sysfs files across suspend/resume) because it totally
>> skipped all the cpufreq callbacks during CPU hotplug in the suspend/resume
>> path, and hence it never actually shut down the cpufreq governor's worker
>> threads during CPU offline in the suspend/resume path.
>>
>> Looking back, the reason we never suspected that commit as the root cause
>> earlier was that the original issue was reported with just the halt command,
>> and nobody had brought suspend/resume into the equation.
>>
>> The reason for _that_, in turn, is that halt/shutdown used to be performed by
>> disabling non-boot CPUs while tasks were frozen, just like suspend/resume...
>> but commit cf7df378a (reboot: rigrate shutdown/reboot to boot cpu), which
>> went in around the same time, changed that logic: shutdown/halt no longer
>> takes CPUs offline. Thus the test-cases for reproducing the bug were vastly
>> different, and we went totally off the trail.
>>
>> Overall, it was one hell of a confusion, with so many commits affecting each
>> other and also affecting the symptoms of the problem in subtle ways. Now that
>> the original problematic commit (a66b2e5) has been completely reverted,
>> revert this intermediate fix too (2f7021a) to fix the CPU hotplug deadlock.
>> Phew!
>>
>> Reported-by: Sergey Senozhatsky <[email protected]>
>> Reported-by: Bartlomiej Zolnierkiewicz <[email protected]>
>> Signed-off-by: Srivatsa S. Bhat <[email protected]>
>> ---
>>
>> drivers/cpufreq/cpufreq_governor.c | 3 ---
>> 1 file changed, 3 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>> index 4645876..7b839a8 100644
>> --- a/drivers/cpufreq/cpufreq_governor.c
>> +++ b/drivers/cpufreq/cpufreq_governor.c
>> @@ -25,7 +25,6 @@
>> #include <linux/slab.h>
>> #include <linux/types.h>
>> #include <linux/workqueue.h>
>> -#include <linux/cpu.h>
>>
>> #include "cpufreq_governor.h"
>>
>> @@ -137,10 +136,8 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
>> if (!all_cpus) {
>> __gov_queue_work(smp_processor_id(), dbs_data, delay);
>> } else {
>> - get_online_cpus();
>> for_each_cpu(i, policy->cpus)
>> __gov_queue_work(i, dbs_data, delay);
>> - put_online_cpus();
>> }
>> }
>> EXPORT_SYMBOL_GPL(gov_queue_work);
>
>

2013-07-16 08:37:32

by Srivatsa S. Bhat

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On 07/16/2013 04:50 AM, Sergey Senozhatsky wrote:
> On (07/15/13 18:49), Srivatsa S. Bhat wrote:
> [..]
>> So here is the solution:
>>
>> On 3.11-rc1, apply these patches in the order mentioned below, and check
>> whether it fixes _all_ problems (both the warnings about IPI as well as the
>> lockdep splat).
>>
>> 1. Patch given in: https://lkml.org/lkml/2013/7/11/661
>> (Just apply patch 1, not the entire patchset).
>>
>> 2. Apply the patch shown below, on top of the above patch:
>>
>> ---------------------------------------------------------------------------
>>
>
> Hello Srivatsa,
> Thanks, I'll test a bit later -- in the morning. (laptop stopped resuming from
> suspend, probably radeon dpm).
>
>

Sure, thanks!

>
> Shouldn't we also kick the console lock?
>
>
> kernel/printk.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/printk.c b/kernel/printk.c
> index d37d45c..3e20233 100644
> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -1926,8 +1926,11 @@ static int __cpuinit console_cpu_notify(struct notifier_block *self,
> {
> switch (action) {
> case CPU_ONLINE:
> + case CPU_ONLINE_FROZEN:
> case CPU_DEAD:
> + case CPU_DEAD_FROZEN:
> case CPU_DOWN_FAILED:
> + case CPU_DOWN_FAILED_FROZEN:
> case CPU_UP_CANCELED:
> console_lock();
> console_unlock();
>
>

No need. suspend_console() and resume_console() already handle it
properly in the suspend/resume case, from what I can see.

Regards,
Srivatsa S. Bhat

2013-07-16 10:44:37

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On (07/16/13 14:03), Srivatsa S. Bhat wrote:
> >> So here is the solution:
> >>
> >> On 3.11-rc1, apply these patches in the order mentioned below, and check
> >> whether it fixes _all_ problems (both the warnings about IPI as well as the
> >> lockdep splat).
> >>
> >> 1. Patch given in: https://lkml.org/lkml/2013/7/11/661
> >> (Just apply patch 1, not the entire patchset).
> >>
> >> 2. Apply the patch shown below, on top of the above patch:
> >>
> >> ---------------------------------------------------------------------------
> >>
> >
> > Hello Srivatsa,
> > Thanks, I'll test a bit later -- in the morning. (laptop stopped resuming from
> > suspend, probably radeon dpm).
> >
> >
>
> Sure, thanks!
>
> >
> > Shouldn't we also kick the console lock?
> >
> >
> > kernel/printk.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/kernel/printk.c b/kernel/printk.c
> > index d37d45c..3e20233 100644
> > --- a/kernel/printk.c
> > +++ b/kernel/printk.c
> > @@ -1926,8 +1926,11 @@ static int __cpuinit console_cpu_notify(struct notifier_block *self,
> > {
> > switch (action) {
> > case CPU_ONLINE:
> > + case CPU_ONLINE_FROZEN:
> > case CPU_DEAD:
> > + case CPU_DEAD_FROZEN:
> > case CPU_DOWN_FAILED:
> > + case CPU_DOWN_FAILED_FROZEN:
> > case CPU_UP_CANCELED:
> > console_lock();
> > console_unlock();
> >
> >
>
> No need. suspend_console() and resume_console() already handle it
> properly in the suspend/resume case, from what I can see.
>

I've managed to wake up my laptop from suspend, and something's not right.


# for i in {1..5}; do \
echo 0 > /sys/devices/system/cpu/cpu3/online; \
echo 0 > /sys/devices/system/cpu/cpu2/online; \
echo 1 > /sys/devices/system/cpu/cpu3/online; \
echo 0 > /sys/devices/system/cpu/cpu1/online; \
echo 1 > /sys/devices/system/cpu/cpu1/online; \
echo 1 > /sys/devices/system/cpu/cpu2/online; \
done
# systemctl suspend
-> resume


[ 227.329656] ACPI: Preparing to enter system sleep state S3
[ 227.353334] PM: Saving platform NVS memory

[ 227.355403] ======================================================
[ 227.355404] [ INFO: possible circular locking dependency detected ]
[ 227.355407] 3.11.0-rc1-dbg-01398-gf537e41-dirty #1838 Not tainted
[ 227.355408] -------------------------------------------------------
[ 227.355411] systemd-sleep/2280 is trying to acquire lock:
[ 227.355426] (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff8104dab4>]
disable_nonboot_cpus+0x24/0x120
[ 227.355427]
but task is already holding lock:
[ 227.355434] (console_lock){+.+.+.}, at: [<ffffffff8104c956>]
suspend_console+0x26/0x40
[ 227.355435]
which lock already depends on the new lock.

[ 227.355436]
the existing dependency chain (in reverse order) is:
[ 227.355441]
-> #2 (console_lock){+.+.+.}:
[ 227.355448] [<ffffffff810b8fb4>] lock_acquire+0xa4/0x200
[ 227.355452] [<ffffffff8104b197>] console_lock+0x77/0x80
[ 227.355456] [<ffffffff8104cf91>] console_cpu_notify+0x31/0x40
[ 227.355462] [<ffffffff8107cd6d>] notifier_call_chain+0x5d/0x110
[ 227.355466] [<ffffffff8107ce2e>] __raw_notifier_call_chain+0xe/0x10
[ 227.355469] [<ffffffff8104d5a3>] cpu_notify+0x23/0x50
[ 227.355473] [<ffffffff8104d5de>] cpu_notify_nofail+0xe/0x20
[ 227.355482] [<ffffffff815fafad>] _cpu_down+0x1ad/0x330
[ 227.355486] [<ffffffff815fb166>] cpu_down+0x36/0x50
[ 227.355493] [<ffffffff814ad8cd>] cpu_subsys_offline+0x1d/0x30
[ 227.355498] [<ffffffff814a8de5>] device_offline+0x95/0xc0
[ 227.355502] [<ffffffff814a8ee2>] store_online+0x42/0x90
[ 227.355506] [<ffffffff814a64f8>] dev_attr_store+0x18/0x30
[ 227.355513] [<ffffffff811ec71b>] sysfs_write_file+0xdb/0x150
[ 227.355517] [<ffffffff8117a68d>] vfs_write+0xbd/0x1e0
[ 227.355522] [<ffffffff8117ad7c>] SyS_write+0x4c/0xa0
[ 227.355527] [<ffffffff8160cbfe>] tracesys+0xd0/0xd5
[ 227.355531]
-> #1 (cpu_hotplug.lock){+.+.+.}:
[ 227.355535] [<ffffffff810b8fb4>] lock_acquire+0xa4/0x200
[ 227.355541] [<ffffffff81605b47>] mutex_lock_nested+0x67/0x410
[ 227.355545] [<ffffffff8104d54b>] cpu_hotplug_begin+0x2b/0x60
[ 227.355549] [<ffffffff8104d61a>] _cpu_up+0x2a/0x170
[ 227.355552] [<ffffffff8104d7b9>] cpu_up+0x59/0x80
[ 227.355558] [<ffffffff81cf51b6>] smp_init+0x64/0x95
[ 227.355566] [<ffffffff81cdaf21>] kernel_init_freeable+0x84/0x191
[ 227.355570] [<ffffffff815fa34e>] kernel_init+0xe/0x180
[ 227.355574] [<ffffffff8160c9ac>] ret_from_fork+0x7c/0xb0
[ 227.355578]
-> #0 (cpu_add_remove_lock){+.+.+.}:
[ 227.355582] [<ffffffff810b8106>] __lock_acquire+0x1766/0x1d30
[ 227.355586] [<ffffffff810b8fb4>] lock_acquire+0xa4/0x200
[ 227.355590] [<ffffffff81605b47>] mutex_lock_nested+0x67/0x410
[ 227.355594] [<ffffffff8104dab4>] disable_nonboot_cpus+0x24/0x120
[ 227.355601] [<ffffffff810a0973>] suspend_devices_and_enter+0x1f3/0x680
[ 227.355605] [<ffffffff810a0fd2>] pm_suspend+0x1d2/0x240
[ 227.355609] [<ffffffff8109fa19>] state_store+0x79/0xf0
[ 227.355614] [<ffffffff81312dbf>] kobj_attr_store+0xf/0x20
[ 227.355618] [<ffffffff811ec71b>] sysfs_write_file+0xdb/0x150
[ 227.355621] [<ffffffff8117a68d>] vfs_write+0xbd/0x1e0
[ 227.355624] [<ffffffff8117ad7c>] SyS_write+0x4c/0xa0
[ 227.355628] [<ffffffff8160cbfe>] tracesys+0xd0/0xd5
[ 227.355629]
other info that might help us debug this:

[ 227.355635] Chain exists of:
cpu_add_remove_lock --> cpu_hotplug.lock --> console_lock

[ 227.355637] Possible unsafe locking scenario:

[ 227.355638] CPU0 CPU1
[ 227.355639] ---- ----
[ 227.355642] lock(console_lock);
[ 227.355644] lock(cpu_hotplug.lock);
[ 227.355647] lock(console_lock);
[ 227.355650] lock(cpu_add_remove_lock);
[ 227.355651]
*** DEADLOCK ***

[ 227.355653] 5 locks held by systemd-sleep/2280:
[ 227.355661] #0: (sb_writers#6){.+.+.+}, at: [<ffffffff8117a78b>] vfs_write+0x1bb/0x1e0
[ 227.355668] #1: (&buffer->mutex){+.+.+.}, at: [<ffffffff811ec67c>] sysfs_write_file+0x3c/0x150
[ 227.355676] #2: (s_active#110){.+.+.+}, at: [<ffffffff811ec703>] sysfs_write_file+0xc3/0x150
[ 227.355683] #3: (pm_mutex){+.+.+.}, at: [<ffffffff810a0e32>] pm_suspend+0x32/0x240
[ 227.355690] #4: (console_lock){+.+.+.}, at: [<ffffffff8104c956>] suspend_console+0x26/0x40
[ 227.355691]
stack backtrace:
[ 227.355695] CPU: 0 PID: 2280 Comm: systemd-sleep Not tainted 3.11.0-rc1-dbg-01398-gf537e41-dirty #1838
[ 227.355697] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
[ 227.355703] ffffffff82208680 ffff88015151bbc8 ffffffff81603038 ffffffff822073f0
[ 227.355707] ffff88015151bc08 ffffffff815ffdaa ffff880153389fa0 ffff88015338a788
[ 227.355712] 1d81e4832c04c441 ffff88015338a760 ffff88015338a788 ffff880153389fa0
[ 227.355713] Call Trace:
[ 227.355719] [<ffffffff81603038>] dump_stack+0x4e/0x82
[ 227.355723] [<ffffffff815ffdaa>] print_circular_bug+0x2b6/0x2c5
[ 227.355727] [<ffffffff810b8106>] __lock_acquire+0x1766/0x1d30
[ 227.355733] [<ffffffff81054aec>] ? walk_system_ram_range+0x5c/0x140
[ 227.355737] [<ffffffff810b63f4>] ? mark_held_locks+0x94/0x140
[ 227.355741] [<ffffffff810b8fb4>] lock_acquire+0xa4/0x200
[ 227.355745] [<ffffffff8104dab4>] ? disable_nonboot_cpus+0x24/0x120
[ 227.355749] [<ffffffff8104dab4>] ? disable_nonboot_cpus+0x24/0x120
[ 227.355753] [<ffffffff81605b47>] mutex_lock_nested+0x67/0x410
[ 227.355757] [<ffffffff8104dab4>] ? disable_nonboot_cpus+0x24/0x120
[ 227.355761] [<ffffffff81607e9e>] ? mutex_unlock+0xe/0x10
[ 227.355768] [<ffffffff8135bd2f>] ? acpi_os_get_iomem+0x4c/0x54
[ 227.355772] [<ffffffff8104dab4>] disable_nonboot_cpus+0x24/0x120
[ 227.355777] [<ffffffff810a0973>] suspend_devices_and_enter+0x1f3/0x680
[ 227.355780] [<ffffffff815fefc6>] ? printk+0x67/0x69
[ 227.355785] [<ffffffff810a0fd2>] pm_suspend+0x1d2/0x240
[ 227.355789] [<ffffffff8109fa19>] state_store+0x79/0xf0
[ 227.355792] [<ffffffff81312dbf>] kobj_attr_store+0xf/0x20
[ 227.355796] [<ffffffff811ec71b>] sysfs_write_file+0xdb/0x150
[ 227.355799] [<ffffffff8117a68d>] vfs_write+0xbd/0x1e0
[ 227.355804] [<ffffffff81198490>] ? fget_light+0x320/0x4b0
[ 227.355808] [<ffffffff8117ad7c>] SyS_write+0x4c/0xa0
[ 227.355811] [<ffffffff8160cbfe>] tracesys+0xd0/0xd5
[ 227.355814] Disabling non-boot CPUs ...
[ 227.357731] smpboot: CPU 1 is now offline
[ 227.461072] smpboot: CPU 2 is now offline
[ 227.565119] smpboot: CPU 3 is now offline



Just to make sure I didn't miss anything:

git diff -u -p drivers/cpufreq/

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 0937b8d..7dcfa68 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1942,13 +1942,15 @@ static int __cpuinit cpufreq_cpu_callback(struct notifier_block *nfb,
if (dev) {
switch (action) {
case CPU_ONLINE:
+ case CPU_ONLINE_FROZEN:
cpufreq_add_dev(dev, NULL);
break;
case CPU_DOWN_PREPARE:
- case CPU_UP_CANCELED_FROZEN:
+ case CPU_DOWN_PREPARE_FROZEN:
__cpufreq_remove_dev(dev, NULL);
break;
case CPU_DOWN_FAILED:
+ case CPU_DOWN_FAILED_FROZEN:
cpufreq_add_dev(dev, NULL);
break;
}
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 4645876..7b839a8 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -25,7 +25,6 @@
#include <linux/slab.h>
#include <linux/types.h>
#include <linux/workqueue.h>
-#include <linux/cpu.h>

#include "cpufreq_governor.h"

@@ -137,10 +136,8 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
if (!all_cpus) {
__gov_queue_work(smp_processor_id(), dbs_data, delay);
} else {
- get_online_cpus();
for_each_cpu(i, policy->cpus)
__gov_queue_work(i, dbs_data, delay);
- put_online_cpus();
}
}
EXPORT_SYMBOL_GPL(gov_queue_work);
diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
index cd9e817..12225d1 100644
--- a/drivers/cpufreq/cpufreq_stats.c
+++ b/drivers/cpufreq/cpufreq_stats.c
@@ -353,13 +353,11 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
cpufreq_update_policy(cpu);
break;
case CPU_DOWN_PREPARE:
+ case CPU_DOWN_PREPARE_FROZEN:
cpufreq_stats_free_sysfs(cpu);
break;
case CPU_DEAD:
- cpufreq_stats_free_table(cpu);
- break;
- case CPU_UP_CANCELED_FROZEN:
- cpufreq_stats_free_sysfs(cpu);
+ case CPU_DEAD_FROZEN:
cpufreq_stats_free_table(cpu);
break;
}


-ss

2013-07-16 15:23:09

by Srivatsa S. Bhat

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On 07/16/2013 04:14 PM, Sergey Senozhatsky wrote:
> On (07/16/13 14:03), Srivatsa S. Bhat wrote:
>>>> So here is the solution:
>>>>
>>>> On 3.11-rc1, apply these patches in the order mentioned below, and check
>>>> whether it fixes _all_ problems (both the warnings about IPI as well as the
>>>> lockdep splat).
>>>>
>>>> 1. Patch given in: https://lkml.org/lkml/2013/7/11/661
>>>> (Just apply patch 1, not the entire patchset).
>>>>
>>>> 2. Apply the patch shown below, on top of the above patch:
>>>>
>>>> ---------------------------------------------------------------------------
>>>>
>>>
>>> Hello Srivatsa,
>>> Thanks, I'll test a bit later -- in the morning. (laptop stopped resuming from
>>> suspend, probably radeon dmp).
>>>
>>>
>>
>> Sure, thanks!
>>
>>>
>>> Shouldn't we also kick the console lock?
>>>
>>>
>>> kernel/printk.c | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/kernel/printk.c b/kernel/printk.c
>>> index d37d45c..3e20233 100644
>>> --- a/kernel/printk.c
>>> +++ b/kernel/printk.c
>>> @@ -1926,8 +1926,11 @@ static int __cpuinit console_cpu_notify(struct notifier_block *self,
>>> {
>>> switch (action) {
>>> case CPU_ONLINE:
>>> + case CPU_ONLINE_FROZEN:
>>> case CPU_DEAD:
>>> + case CPU_DEAD_FROZEN:
>>> case CPU_DOWN_FAILED:
>>> + case CPU_DOWN_FAILED_FROZEN:
>>> case CPU_UP_CANCELED:
>>> console_lock();
>>> console_unlock();
>>>
>>>
>>
>> No need. suspend_console() and resume_console() already handle it
>> properly in the suspend/resume case, from what I can see.
>>
>
> I've managed to wake up my laptop from suspend, and something's not right.
>
>
> # for i in {1..5}; do \
> echo 0 > /sys/devices/system/cpu/cpu3/online; \
> echo 0 > /sys/devices/system/cpu/cpu2/online; \
> echo 1 > /sys/devices/system/cpu/cpu3/online; \
> echo 0 > /sys/devices/system/cpu/cpu1/online; \
> echo 1 > /sys/devices/system/cpu/cpu1/online; \
> echo 1 > /sys/devices/system/cpu/cpu2/online; \
> done
> # systemctl suspend
> -> resume
>
>
> [ 227.329656] ACPI: Preparing to enter system sleep state S3
> [ 227.353334] PM: Saving platform NVS memory
>
> [ 227.355403] ======================================================
> [ 227.355404] [ INFO: possible circular locking dependency detected ]
> [ 227.355407] 3.11.0-rc1-dbg-01398-gf537e41-dirty #1838 Not tainted
> [ 227.355408] -------------------------------------------------------
> [ 227.355411] systemd-sleep/2280 is trying to acquire lock:
> [ 227.355426] (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff8104dab4>]
> disable_nonboot_cpus+0x24/0x120
> [ 227.355427]
> but task is already holding lock:
> [ 227.355434] (console_lock){+.+.+.}, at: [<ffffffff8104c956>]
> suspend_console+0x26/0x40
> [ 227.355435]
> which lock already depends on the new lock.
>
> [ 227.355436]
> the existing dependency chain (in reverse order) is:
> [ 227.355441]
> -> #2 (console_lock){+.+.+.}:
> [ 227.355448] [<ffffffff810b8fb4>] lock_acquire+0xa4/0x200
> [ 227.355452] [<ffffffff8104b197>] console_lock+0x77/0x80
> [ 227.355456] [<ffffffff8104cf91>] console_cpu_notify+0x31/0x40
> [ 227.355462] [<ffffffff8107cd6d>] notifier_call_chain+0x5d/0x110
> [ 227.355466] [<ffffffff8107ce2e>] __raw_notifier_call_chain+0xe/0x10
> [ 227.355469] [<ffffffff8104d5a3>] cpu_notify+0x23/0x50
> [ 227.355473] [<ffffffff8104d5de>] cpu_notify_nofail+0xe/0x20
> [ 227.355482] [<ffffffff815fafad>] _cpu_down+0x1ad/0x330
> [ 227.355486] [<ffffffff815fb166>] cpu_down+0x36/0x50
> [ 227.355493] [<ffffffff814ad8cd>] cpu_subsys_offline+0x1d/0x30
> [ 227.355498] [<ffffffff814a8de5>] device_offline+0x95/0xc0
> [ 227.355502] [<ffffffff814a8ee2>] store_online+0x42/0x90
> [ 227.355506] [<ffffffff814a64f8>] dev_attr_store+0x18/0x30
> [ 227.355513] [<ffffffff811ec71b>] sysfs_write_file+0xdb/0x150
> [ 227.355517] [<ffffffff8117a68d>] vfs_write+0xbd/0x1e0
> [ 227.355522] [<ffffffff8117ad7c>] SyS_write+0x4c/0xa0
> [ 227.355527] [<ffffffff8160cbfe>] tracesys+0xd0/0xd5
> [ 227.355531]
> -> #1 (cpu_hotplug.lock){+.+.+.}:
> [ 227.355535] [<ffffffff810b8fb4>] lock_acquire+0xa4/0x200
> [ 227.355541] [<ffffffff81605b47>] mutex_lock_nested+0x67/0x410
> [ 227.355545] [<ffffffff8104d54b>] cpu_hotplug_begin+0x2b/0x60
> [ 227.355549] [<ffffffff8104d61a>] _cpu_up+0x2a/0x170
> [ 227.355552] [<ffffffff8104d7b9>] cpu_up+0x59/0x80
> [ 227.355558] [<ffffffff81cf51b6>] smp_init+0x64/0x95
> [ 227.355566] [<ffffffff81cdaf21>] kernel_init_freeable+0x84/0x191
> [ 227.355570] [<ffffffff815fa34e>] kernel_init+0xe/0x180
> [ 227.355574] [<ffffffff8160c9ac>] ret_from_fork+0x7c/0xb0
> [ 227.355578]
> -> #0 (cpu_add_remove_lock){+.+.+.}:
> [ 227.355582] [<ffffffff810b8106>] __lock_acquire+0x1766/0x1d30
> [ 227.355586] [<ffffffff810b8fb4>] lock_acquire+0xa4/0x200
> [ 227.355590] [<ffffffff81605b47>] mutex_lock_nested+0x67/0x410
> [ 227.355594] [<ffffffff8104dab4>] disable_nonboot_cpus+0x24/0x120
> [ 227.355601] [<ffffffff810a0973>] suspend_devices_and_enter+0x1f3/0x680
> [ 227.355605] [<ffffffff810a0fd2>] pm_suspend+0x1d2/0x240
> [ 227.355609] [<ffffffff8109fa19>] state_store+0x79/0xf0
> [ 227.355614] [<ffffffff81312dbf>] kobj_attr_store+0xf/0x20
> [ 227.355618] [<ffffffff811ec71b>] sysfs_write_file+0xdb/0x150
> [ 227.355621] [<ffffffff8117a68d>] vfs_write+0xbd/0x1e0
> [ 227.355624] [<ffffffff8117ad7c>] SyS_write+0x4c/0xa0
> [ 227.355628] [<ffffffff8160cbfe>] tracesys+0xd0/0xd5
> [ 227.355629]
> other info that might help us debug this:
>
> [ 227.355635] Chain exists of:
> cpu_add_remove_lock --> cpu_hotplug.lock --> console_lock
>
> [ 227.355637] Possible unsafe locking scenario:
>
> [ 227.355638] CPU0 CPU1
> [ 227.355639] ---- ----
> [ 227.355642] lock(console_lock);
> [ 227.355644] lock(cpu_hotplug.lock);
> [ 227.355647] lock(console_lock);
> [ 227.355650] lock(cpu_add_remove_lock);
> [ 227.355651]
> *** DEADLOCK ***
>
> [ 227.355653] 5 locks held by systemd-sleep/2280:
> [ 227.355661] #0: (sb_writers#6){.+.+.+}, at: [<ffffffff8117a78b>] vfs_write+0x1bb/0x1e0
> [ 227.355668] #1: (&buffer->mutex){+.+.+.}, at: [<ffffffff811ec67c>] sysfs_write_file+0x3c/0x150
> [ 227.355676] #2: (s_active#110){.+.+.+}, at: [<ffffffff811ec703>] sysfs_write_file+0xc3/0x150
> [ 227.355683] #3: (pm_mutex){+.+.+.}, at: [<ffffffff810a0e32>] pm_suspend+0x32/0x240
> [ 227.355690] #4: (console_lock){+.+.+.}, at: [<ffffffff8104c956>] suspend_console+0x26/0x40
> [ 227.355691]
> stack backtrace:
> [ 227.355695] CPU: 0 PID: 2280 Comm: systemd-sleep Not tainted 3.11.0-rc1-dbg-01398-gf537e41-dirty #1838
> [ 227.355697] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
> [ 227.355703] ffffffff82208680 ffff88015151bbc8 ffffffff81603038 ffffffff822073f0
> [ 227.355707] ffff88015151bc08 ffffffff815ffdaa ffff880153389fa0 ffff88015338a788
> [ 227.355712] 1d81e4832c04c441 ffff88015338a760 ffff88015338a788 ffff880153389fa0
> [ 227.355713] Call Trace:
> [ 227.355719] [<ffffffff81603038>] dump_stack+0x4e/0x82
> [ 227.355723] [<ffffffff815ffdaa>] print_circular_bug+0x2b6/0x2c5
> [ 227.355727] [<ffffffff810b8106>] __lock_acquire+0x1766/0x1d30
> [ 227.355733] [<ffffffff81054aec>] ? walk_system_ram_range+0x5c/0x140
> [ 227.355737] [<ffffffff810b63f4>] ? mark_held_locks+0x94/0x140
> [ 227.355741] [<ffffffff810b8fb4>] lock_acquire+0xa4/0x200
> [ 227.355745] [<ffffffff8104dab4>] ? disable_nonboot_cpus+0x24/0x120
> [ 227.355749] [<ffffffff8104dab4>] ? disable_nonboot_cpus+0x24/0x120
> [ 227.355753] [<ffffffff81605b47>] mutex_lock_nested+0x67/0x410
> [ 227.355757] [<ffffffff8104dab4>] ? disable_nonboot_cpus+0x24/0x120
> [ 227.355761] [<ffffffff81607e9e>] ? mutex_unlock+0xe/0x10
> [ 227.355768] [<ffffffff8135bd2f>] ? acpi_os_get_iomem+0x4c/0x54
> [ 227.355772] [<ffffffff8104dab4>] disable_nonboot_cpus+0x24/0x120
> [ 227.355777] [<ffffffff810a0973>] suspend_devices_and_enter+0x1f3/0x680
> [ 227.355780] [<ffffffff815fefc6>] ? printk+0x67/0x69
> [ 227.355785] [<ffffffff810a0fd2>] pm_suspend+0x1d2/0x240
> [ 227.355789] [<ffffffff8109fa19>] state_store+0x79/0xf0
> [ 227.355792] [<ffffffff81312dbf>] kobj_attr_store+0xf/0x20
> [ 227.355796] [<ffffffff811ec71b>] sysfs_write_file+0xdb/0x150
> [ 227.355799] [<ffffffff8117a68d>] vfs_write+0xbd/0x1e0
> [ 227.355804] [<ffffffff81198490>] ? fget_light+0x320/0x4b0
> [ 227.355808] [<ffffffff8117ad7c>] SyS_write+0x4c/0xa0
> [ 227.355811] [<ffffffff8160cbfe>] tracesys+0xd0/0xd5
> [ 227.355814] Disabling non-boot CPUs ...
> [ 227.357731] smpboot: CPU 1 is now offline
> [ 227.461072] smpboot: CPU 2 is now offline
> [ 227.565119] smpboot: CPU 3 is now offline
>
>

This also looks like a different issue altogether, and IMHO deserves
attention in a separate, dedicated email thread. Can you post it in a
new thread please?

Also, since you didn't get the original lockdep warning you reported,
and since you didn't hit the IPI-to-offline-cpus warnings either, I
think we can safely conclude that my patches fixed your original problem.

Rafael, could you kindly pick up this second patch[2] as well (with CC
to stable)? (I'm aware that you already picked up the first one[1]).

Thanks a lot!

Regards,
Srivatsa S. Bhat

[1]. https://lkml.org/lkml/2013/7/11/661
[2]. http://marc.info/?l=linux-kernel&m=137389460805002&w=2

2013-07-16 21:19:34

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [LOCKDEP] cpufreq: possible circular locking dependency detected

On Tuesday, July 16, 2013 08:49:30 PM Srivatsa S. Bhat wrote:
> On 07/16/2013 04:14 PM, Sergey Senozhatsky wrote:
> > On (07/16/13 14:03), Srivatsa S. Bhat wrote:
> > [ ... quoted lockdep report and patches trimmed; see previous message ... ]
>
> This also looks like a different issue altogether, and IMHO deserves
> attention in a separate, dedicated email thread. Can you post it in a
> new thread please?
>
> Also, since you didn't get the original lockdep warning you reported,
> and since you didn't hit the IPI-to-offline-cpus warnings either, I
> think we can safely conclude that my patches fixed your original problem.
>
> Rafael, could you kindly pick up this second patch[2] as well (with CC
> to stable)? (I'm aware that you already picked up the first one[1]).

Sure, I will.

Thanks a lot for working on this!

Rafael


> [1]. https://lkml.org/lkml/2013/7/11/661
> [2]. http://marc.info/?l=linux-kernel&m=137389460805002&w=2
>
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.