Date: Tue, 16 Dec 2014 11:59:07 +0530
Subject: Re: [RCU] kernel hangs in wait_rcu_gp during suspend path
From: Arun KS
To: "linux-kernel@vger.kernel.org"
Cc: Paul McKenney, josh@joshtriplett.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com, laijs@cn.fujitsu.com

Hello,

I dug a little deeper to understand the situation. All the other CPUs are already in the idle thread. As per my understanding, for the grace period to end, at least one of the following should happen on every online CPU:

1. a context switch,
2. a switch to user space,
3. a switch to the idle thread.

In this situation, since all the other cores are already idle, none of the above is met on all online cores, so the grace period keeps getting extended and never finishes.

Below is the state of the runqueues when the hang happens.

--------------start------------------------------------
crash> runq
CPU 0 [OFFLINE]

CPU 1 [OFFLINE]

CPU 2 [OFFLINE]

CPU 3 [OFFLINE]

CPU 4 RUNQUEUE: c3192e40
  CURRENT: PID: 0      TASK: f0874440  COMMAND: "swapper/4"
  RT PRIO_ARRAY: c3192f20
     [no tasks queued]
  CFS RB_ROOT: c3192eb0
     [no tasks queued]

CPU 5 RUNQUEUE: c31a0e40
  CURRENT: PID: 0      TASK: f0874980  COMMAND: "swapper/5"
  RT PRIO_ARRAY: c31a0f20
     [no tasks queued]
  CFS RB_ROOT: c31a0eb0
     [no tasks queued]

CPU 6 RUNQUEUE: c31aee40
  CURRENT: PID: 0      TASK: f0874ec0  COMMAND: "swapper/6"
  RT PRIO_ARRAY: c31aef20
     [no tasks queued]
  CFS RB_ROOT: c31aeeb0
     [no tasks queued]

CPU 7 RUNQUEUE: c31bce40
  CURRENT: PID: 0      TASK: f0875400  COMMAND: "swapper/7"
  RT PRIO_ARRAY: c31bcf20
     [no tasks queued]
  CFS RB_ROOT: c31bceb0
     [no tasks queued]
--------------end------------------------------------

If my understanding is correct, the patch below should help, because it expedites grace periods during suspend:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d1d74d14e98a6be740a6f12456c7d9ad47be9c9c

But I wonder why it was not taken into the stable trees. Can we take it?

Appreciate your help.

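To make the grace-period condition described earlier in this mail concrete, here is a toy Python model. It is only a sketch of my understanding, not kernel code: the names Cpu, reported_qs, grace_period_done and blocking_cpus are made up for illustration, and the model deliberately leaves open which events count as a quiescent state. It encodes a single rule: a grace period can end only after every online CPU has reported a quiescent state since the grace period began, and offline CPUs are ignored.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    """Hypothetical per-CPU record; not a kernel data structure."""
    online: bool
    # True once this CPU has reported a quiescent state for the
    # current grace period (assumption: one flag per grace period).
    reported_qs: bool = False


def grace_period_done(cpus):
    """Return True when no online CPU is still holding up the grace period."""
    return all(c.reported_qs for c in cpus if c.online)


def blocking_cpus(cpus):
    """Return the indices of online CPUs that have not reported yet."""
    return [i for i, c in enumerate(cpus) if c.online and not c.reported_qs]


# CPU layout taken from the runq output: CPUs 0-3 offline, CPUs 4-7 online.
cpus = [Cpu(online=False) for _ in range(4)] + [Cpu(online=True) for _ in range(4)]
```

With this layout, blocking_cpus(cpus) keeps listing CPUs 4 to 7 until each of them reports, which is the state I believe the hang corresponds to.
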
Thanks,
Arun

On Mon, Dec 15, 2014 at 10:34 PM, Arun KS wrote:
> Hi,
>
> Here is the backtrace of the process hanging in wait_rcu_gp:
>
> PID: 247 TASK: e16e7380 CPU: 4 COMMAND: "kworker/u16:5"
> #0 [] (__schedule) from []
> #1 [] (schedule_timeout) from []
> #2 [] (wait_for_common) from []
> #3 [] (wait_rcu_gp) from []
> #4 [] (atomic_notifier_chain_unregister) from []
> #5 [] (cpufreq_interactive_disable_sched_input) from []
> #6 [] (cpufreq_governor_interactive) from []
> #7 [] (__cpufreq_governor) from []
> #8 [] (__cpufreq_remove_dev_finish) from []
> #9 [] (cpufreq_cpu_callback) from []
> #10 [] (notifier_call_chain) from []
> #11 [] (__cpu_notify) from []
> #12 [] (cpu_notify_nofail) from []
> #13 [] (_cpu_down) from []
> #14 [] (disable_nonboot_cpus) from []
> #15 [] (suspend_devices_and_enter) from []
> #16 [] (pm_suspend) from []
> #17 [] (try_to_suspend) from []
> #18 [] (process_one_work) from []
> #19 [] (worker_thread) from []
> #20 [] (kthread) from []
>
> Will this patch help here?
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d1d74d14e98a6be740a6f12456c7d9ad47be9c9c
>
> I couldn't really understand why it got stuck in synchronize_rcu().
> Please give some pointers to debug this further.
>
> Below are the enabled configs related to RCU:
>
> CONFIG_TREE_PREEMPT_RCU=y
> CONFIG_PREEMPT_RCU=y
> CONFIG_RCU_STALL_COMMON=y
> CONFIG_RCU_FANOUT=32
> CONFIG_RCU_FANOUT_LEAF=16
> CONFIG_RCU_FAST_NO_HZ=y
> CONFIG_RCU_CPU_STALL_TIMEOUT=21
> CONFIG_RCU_CPU_STALL_VERBOSE=y
>
> Kernel version is 3.10.28
> Architecture is ARM
>
> Thanks,
> Arun