Date: Tue, 9 Jan 2018 06:11:14 -0800
From: Tejun Heo
To: "Paul E. McKenney"
Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: Can RCU stall lead to hard lockups?
Message-ID: <20180109141114.GF3668920@devbig577.frc2.facebook.com>
References: <20180109035207.GD3668920@devbig577.frc2.facebook.com> <20180109042425.GS9671@linux.vnet.ibm.com>
In-Reply-To: <20180109042425.GS9671@linux.vnet.ibm.com>
User-Agent: Mutt/1.5.21 (2010-09-15)

Hello, Paul.

On Mon, Jan 08, 2018 at 08:24:25PM -0800, Paul E. McKenney wrote:
> > I don't know the RCU code at all but it *looks* like the first CPU is
> > taking a sweet while flushing printk buffer while holding a lock (the
> > console is IPMI serial console, which faithfully emulates 115200 baud
> > rate), and everyone else seems stuck waiting for that spinlock in
> > rcu_check_callbacks().
> >
> > Does this sound possible?
>
> 115200 baud?  Ouch!!!  That -will- result in trouble from console
> printing, and often also in RCU CPU stall warnings.

It could even be slower than 115200, and we occasionally see RCU stall
warnings caused by printk storms, for example, while the kernel is
trying to dump a lot of info after an OOM.  That's an issue we probably
want to improve from the printk side; however, those storms don't
usually lead to the NMI hard lockup detector kicking in and crashing
the machine, which is the peculiarity here.  (At 115200 baud the
console drains only about 11 KB/s, so a long dump printed while holding
a spinlock can keep a CPU busy well past the hard lockup threshold.)

Hmmm... show_state_filter(), the function which dumps all task
backtraces, shares a similar problem and avoids it by explicitly
calling touch_nmi_watchdog() on each iteration (see the sketch after
the patch below).  Maybe we can do something like the following from
RCU too?

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index db85ca3..3c4c4d3 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -561,8 +561,14 @@ static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
 	}
 	t = list_entry(rnp->gp_tasks->prev,
 		       struct task_struct, rcu_node_entry);
-	list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry)
+	list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
+		/*
+		 * We could be printing a lot of these messages while
+		 * holding a spinlock.  Avoid triggering hard lockup.
+		 */
+		touch_nmi_watchdog();
 		sched_show_task(t);
+	}
 	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 }

@@ -1678,6 +1684,12 @@ static void print_cpu_stall_info(struct rcu_state *rsp, int cpu)
 	char *ticks_title;
 	unsigned long ticks_value;

+	/*
+	 * We could be printing a lot of these messages while holding a
+	 * spinlock.  Avoid triggering hard lockup.
+	 */
+	touch_nmi_watchdog();
+
 	if (rsp->gpnum == rdp->gpnum) {
 		ticks_title = "ticks this GP";
 		ticks_value = rdp->ticks_this_gp;
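For reference, the pattern being borrowed above is roughly what
show_state_filter() in kernel/sched/core.c already does when dumping
every task's backtrace.  The sketch below is abridged and paraphrased
from memory, so the exact loop macro and state-filter check may differ
between kernel versions; it only illustrates the
touch-the-watchdog-per-iteration idea and isn't meant to be applied
as-is.

void show_state_filter(unsigned long state_filter)
{
	struct task_struct *g, *p;

	rcu_read_lock();
	for_each_process_thread(g, p) {
		/*
		 * Reset the NMI watchdog (and the softlockup
		 * watchdogs) on every iteration: dumping all tasks on
		 * a slow serial console can take far longer than the
		 * watchdog threshold.
		 */
		touch_nmi_watchdog();
		touch_all_softlockup_watchdogs();
		if (!state_filter || (p->state & state_filter))
			sched_show_task(p);
	}
	rcu_read_unlock();
	/* ... debug lock dumps etc. elided ... */
}

The point is simply that the watchdog gets reset before each
potentially slow sched_show_task() print, so the hard lockup detector
never observes a full threshold's worth of silence even though the
console is crawling at serial speed.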