Date: Wed, 30 Nov 2016 10:21:44 -0800
From: Marc MERLIN
To: Linus Torvalds
Cc: Kent Overstreet, Tejun Heo, Jens Axboe, Michal Hocko, Vlastimil Babka, linux-mm, LKML, Joonsoo Kim, Greg Kroah-Hartman
Message-ID: <20161130182144.xhnmgpsyyv423pqw@merlins.org>
Subject: Re: 4.8.8 kernel trigger OOM killer repeatedly when I have lots of RAM that should be free

On Wed, Nov 30, 2016 at 10:14:50AM -0800, Linus Torvalds wrote:
> Anyway, none of this seems new per se.
> I'm adding Kent and Jens to the
> cc (Tejun already was), in the hope that maybe they have some idea how
> to control the nasty worst-case behavior wrt workqueue lockup (it's
> not really a "lockup", it looks like it's just hundreds of workqueues
> all waiting for IO to complete and much too deep IO queues).

I'll take your word for it, all I got in the end was
Kernel panic - not syncing: Hard LOCKUP
and the system stone dead when I woke up hours later.

> And I think your NMI watchdog then turns the "system is no longer
> responsive" into an actual kernel panic.

Ah, I see.

Thanks for the reply, and sorry for bringing in that separate thread
from the btrfs mailing list, which effectively was a suggestion similar
to what you're saying here too.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901