Subject: Re: Regression with suspicious RCU usage splats with cpu_pm change
From: Alex Shi
To: Tony Lindgren, "Rafael J. Wysocki"
Cc: "Paul E. McKenney", linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, linux-omap@vger.kernel.org, Catalin Marinas, Will Deacon
Date: Thu, 13 Jul 2017 17:43:06 +0800
In-Reply-To: <20170713070749.GE16509@atomide.com>

On 07/13/2017 03:07 PM, Tony Lindgren wrote:
> Hi,
>
> Looks like next-20170713 gives me a bunch of "suspicious RCU usage"
> splats with cpuidle_coupled on duovero, see below. I bisected it down
> to commit 2f027e003d05 ("cpu_pm: replace raw_notifier with
> atomic_notifier").
>
> Regards,
>
> Tony
>
> 8< -----------------------
> =============================
> WARNING: suspicious RCU usage
> 4.12.0-next-20170713+ #118 Tainted: G W
> -----------------------------
> ./include/linux/rcupdate.h:611 rcu_read_lock() used illegally while idle!
> [ 2.928802]
> other info that might help us debug this:
> [ 2.928802]
> [ 2.946777]
> RCU used illegally from idle CPU!
> rcu_scheduler_active = 2, debug_locks = 1
> RCU used illegally from extended quiescent state!

CC Catalin & Will,

Shame on me. :(

As the comment in lockdep_rcu_suspicious() explains, rcu_read_lock()
cannot be used after rcu_idle_enter():

 * If a CPU is in the RCU-free window in idle (ie: in the section
 * between rcu_idle_enter() and rcu_idle_exit(), then RCU
 * considers that CPU to be in an "extended quiescent state",
 * which means that RCU will be completely ignoring that CPU.
 * Therefore, rcu_read_lock() and friends have absolutely no
 * effect on a CPU running in that state. In other words, even if
 * such an RCU-idle CPU has called rcu_read_lock(), RCU might well
 * delete data structures out from under it. RCU really has no
 * choice here: we need to keep an RCU-free window in idle where
 * the CPU may possibly enter into low power mode. This way we can
 * notice an extended quiescent state to other CPUs that started a grace
 * period. Otherwise we would delay any grace period as long as we run
 * in the idle task.

Although the CPU is still running and has not actually gone idle yet,
RCU already counts it as being in a quiescent state.

I guess it's not a good idea to defer rcu_idle_enter() for all archs.
We need another solution to this problem. Maybe a non-sleeping
raw_rwlock? Or perhaps the best way is to split the notification from
the real idle trigger in arm_enter_idle_state()?

Regards
Alex