Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753251AbdGSIBe (ORCPT ); Wed, 19 Jul 2017 04:01:34 -0400
Received: from muru.com ([72.249.23.125]:55086 "EHLO muru.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752808AbdGSIBT (ORCPT ); Wed, 19 Jul 2017 04:01:19 -0400
Date: Wed, 19 Jul 2017 01:01:15 -0700
From: Tony Lindgren
To: "Paul E. McKenney"
Cc: Alex Shi, "Rafael J. Wysocki", Linux Kernel Mailing List, Linux PM,
	Linux OMAP Mailing List, Catalin Marinas, Will Deacon
Subject: Re: Regression with suspicious RCU usage splats with cpu_pm change
Message-ID: <20170719080115.GN10026@atomide.com>
References: <20170713070749.GE16509@atomide.com>
	<20170713124318.GK2393@linux.vnet.ibm.com>
	<20170717060806.GC10026@atomide.com>
	<20170717123938.GP2393@linux.vnet.ibm.com>
	<20170718054137.GK10026@atomide.com>
	<20170718170001.GD3981@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170718170001.GD3981@linux.vnet.ibm.com>
User-Agent: Mutt/1.8.3 (2017-05-23)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2863
Lines: 66

* Paul E. McKenney [170718 10:00]:
> On Mon, Jul 17, 2017 at 10:41:38PM -0700, Tony Lindgren wrote:
> > * Paul E. McKenney [170717 05:40]:
> > > On Sun, Jul 16, 2017 at 11:08:07PM -0700, Tony Lindgren wrote:
> > > > * Alex Shi [170716 16:25]:
> > > > > I reused the rcu_irq_enter_irqson() from RCU_NONIDLE to avoid this issue.
> > > > > It works fine.
> > > > >
> > > > > Tony, Could you like to give a tested-by if this patch works for you.
> > > >
> > > > Yeah that keeps things booting for me with no splats so:
> > > >
> > > > Tested-by: Tony Lindgren
> > > >
> > > > In general, it seems we're missing the knowledge in Linux kernel
> > > > of when the entire system is idle. Right now it seems that only
> > > > cpuidle_coupled knows that?
> > > >
> > > > We could probably simplify things by adding some PM state for
> > > > entire system idle. Then cpuidle code and timer code could use
> > > > that to test when it's safe to do whatever the SoC needs to do
> > > > to enter deeper power states.
> > > >
> > > > If we already have something like that, please do let me know :)
> > >
> > > Well, we used to have CONFIG_NO_HZ_FULL_SYSIDLE, which detected
> > > full-system idle lazily so as to avoid scalability bottlenecks.
> > > https://lwn.net/Articles/558284/
> > >
> > > No one was using it, so I removed it last merge window. The
> > > patch that removed it is at sysidle.2017.05.11a, which can
> > > probably still be reverted cleanly. Or just use v4.11 or earlier.
> >
> > OK thanks for the pointer, for reference that commit is
> > fe5ac724d81a ("rcu: Remove nohz_full full-system-idle state
> > machine").
> >
> > For a potential user, I think we could use it for example in
> > cpuidle_enter_state_coupled() + omap_enter_idle_coupled() where
> > we try to figure out if the system is fully idle before calling
> > tick_broadcast_enter().
>
> Would you be willing to prototype your usage on v4.12? It still has
> NO_HZ_FULL_SYSIDLE. You have to enable NO_HZ_FULL in order to enable
> NO_HZ_FULL_SYSIDLE at the moment.
>
> Either way, here is the important bit for usage:
>
> 	bool rcu_sys_is_idle(void);
> 	void rcu_sysidle_force_exit(void);
>
> The rcu_sys_is_idle() function returns true if all CPUs other than the
> time-keeping CPU (that is, tick_do_timer_cpu, which is usually CPU 0)
> are in their idle loop. Of course, if you invoke rcu_sys_is_idle()
> from any CPU other than the time-keeping CPU, you will automatically
> get a return value of false.
>
> RCU's idle-exit code already sets state appropriately, but if there
> is some other circumstance where you need to force the state machine
> out of all-CPUs-idle state, you can call rcu_sysidle_force_exit().

OK sure I'll take a look at some point.

Thanks,

Tony
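
For reference, the return-value rule Paul describes for rcu_sys_is_idle()
can be sketched as a small userspace model. This is not kernel code and
not the removed implementation: every model_* name, MODEL_NR_CPUS, and
MODEL_TIMEKEEPING_CPU are hypothetical, the calling CPU is passed
explicitly (the real function takes no argument), and the check is done
eagerly here whereas the thread says the kernel detected full-system idle
lazily.

```c
#include <stdbool.h>

/* Hypothetical model parameters, not kernel symbols. */
#define MODEL_NR_CPUS         4
#define MODEL_TIMEKEEPING_CPU 0   /* stands in for tick_do_timer_cpu */

/* Model state: true while the given CPU sits in its idle loop. */
static bool model_cpu_idle[MODEL_NR_CPUS];

/*
 * Model of the rule described in the thread: the answer is true only
 * when asked from the time-keeping CPU and every other CPU is idle.
 * The time-keeping CPU's own idle state is deliberately not consulted.
 */
static bool model_sys_is_idle(int calling_cpu)
{
	int cpu;

	/* Any CPU other than the time-keeping CPU always gets false. */
	if (calling_cpu != MODEL_TIMEKEEPING_CPU)
		return false;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (cpu == MODEL_TIMEKEEPING_CPU)
			continue;
		if (!model_cpu_idle[cpu])
			return false;
	}
	return true;
}
```

Under this model, a caller such as the coupled-idle path mentioned above
would only see true when running on the time-keeping CPU, so a check made
from any other CPU would never report full-system idle.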