Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1763669AbcLTNez (ORCPT ); Tue, 20 Dec 2016 08:34:55 -0500
Received: from www262.sakura.ne.jp ([202.181.97.72]:14901 "EHLO
        www262.sakura.ne.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1754786AbcLTNex (ORCPT );
        Tue, 20 Dec 2016 08:34:53 -0500
To: vegard.nossum@gmail.com
Cc: linux-kernel@vger.kernel.org, vegard.nossum@oracle.com,
        akpm@linux-foundation.org, torvalds@linux-foundation.org,
        msb@chromium.org, paulmck@linux.vnet.ibm.com, peterz@infradead.org,
        tglx@linutronix.de, mingo@kernel.org
Subject: Re: [PATCH] locking/hung_task: Defer showing held locks
From: Tetsuo Handa
References: <1481640325-7076-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp>
In-Reply-To:
Message-Id: <201612202234.FJB65612.tLSVOFQFMHFJOO@I-love.SAKURA.ne.jp>
X-Mailer: Winbiff [Version 2.51 PL2]
X-Accept-Language: ja,en,zh
Date: Tue, 20 Dec 2016 22:34:47 +0900
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2493
Lines: 55

Vegard Nossum wrote:
> On 13 December 2016 at 15:45, Tetsuo Handa
> wrote:
> > When I was running my testcase which may block hundreds of threads
> > on fs locks, I got lockup due to output from debug_show_all_locks()
> > added by commit b2d4c2edb2e4f89a ("locking/hung_task: Show all locks").
> >
> > I think we don't need to call debug_show_all_locks() on each blocked
> > thread. Let's defer calling debug_show_all_locks() till before panic()
> > or leaving for_each_process_thread() loop.
>
> First of all, sorry for not answering earlier.

No problem.

>
> I'm not sure I fully understand the problem, you say the "output from
> debug_show_all_locks()" caused a lockup, but was the problem simply
> that the amount of output caused it to stall for a long time?

Linux 4.9 added warn_alloc(), which calls printk() periodically when a
memory allocation stalls for too long, in order to tell the administrator
that something might be wrong with memory allocation.

However, printk() waits, using cond_resched(), until all pending data has
been sent to the console. It therefore keeps waiting for as long as
somebody else calls printk() while cond_resched() is being called.

This is problematic in an OOM situation. The OOM killer calls printk()
with oom_lock held, and while the OOM killer holds oom_lock, warn_alloc()
keeps calling printk() periodically. As a result, the printk() called from
the OOM killer can be unable to return forever.

It turned out that khungtaskd is another source of periodic printk() calls
when threads are blocked on fs locks waiting for memory allocation.
debug_show_all_locks() generates far more output than warn_alloc() if it
is called for each thread blocked on fs locks waiting for memory
allocation. Therefore, we should avoid calling debug_show_all_locks() for
each blocked thread.

The full story starts at
http://lkml.kernel.org/r/1481020439-5867-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
but I would appreciate it if you could join the discussion at
http://lkml.kernel.org/r/1478416501-10104-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp .

>
> Could we instead
>
> 1) move the debug_show_all_locks() into the if
> (sysctl_hung_task_panic) bit unconditionally
>
> 2) call something (touch_nmi_watchdog()?) inside debug_show_all_locks()
>
> 3) in another way make debug_show_all_locks() more robust so it doesn't "lockup"
>
> ?

Yes, that might be an improvement. But it is not needed for this patch.
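
To illustrate the deferral discussed in this thread, here is a minimal
user-space sketch; it is not the actual patch. The names struct fake_task,
dump_all_locks_stub(), check_hung_tasks() and lock_dump_calls are all
hypothetical stand-ins introduced only for this example.
dump_all_locks_stub() merely plays the role of debug_show_all_locks(), and
the panic() path is left out for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a task; not the kernel's struct task_struct. */
struct fake_task {
	const char *comm;	/* task name */
	bool hung;		/* true if the task looks blocked for too long */
};

/* Counts how often the expensive dump ran, so the effect can be observed. */
static int lock_dump_calls;

/* Stand-in for debug_show_all_locks(): pretend this output is costly. */
static void dump_all_locks_stub(void)
{
	lock_dump_calls++;
	printf("(dump of all held locks would go here)\n");
}

/*
 * Scan all tasks, report each hung one briefly, and dump held locks
 * only once after the loop. Returns the number of hung tasks found.
 */
static int check_hung_tasks(const struct fake_task *tasks, size_t n)
{
	bool show_locks = false;
	int hung = 0;

	for (size_t i = 0; i < n; i++) {
		if (!tasks[i].hung)
			continue;
		printf("INFO: task %s blocked for too long\n", tasks[i].comm);
		hung++;
		show_locks = true;	/* remember, but do not dump yet */
	}

	if (show_locks)
		dump_all_locks_stub();	/* one dump, however many tasks hung */

	return hung;
}
```

With this shape, hundreds of blocked threads still produce one short line
each, but the large lock dump is emitted a single time per scan, so the
amount of printk()-style output no longer grows with the number of blocked
threads multiplied by the size of the dump.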