Date: Mon, 15 Oct 2007 07:40:03 -0400
From: Theodore Tso
To: Nick Piggin
Cc: Rob Landley, James Bottomley, Matthew Wilcox,
    linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org,
    Jens Axboe, Suparna Bhattacharya, Nick Piggin
Subject: Re: OOM killer gripe (was Re: What still uses the block layer?)
Message-ID: <20071015114003.GB21216@thunk.org>
References: <200710112011.22000.rob@landley.net>
    <20071015014503.GF9715@thunk.org>
    <200710150304.00901.rob@landley.net>
    <200710152337.45252.nickpiggin@yahoo.com.au>
In-Reply-To: <200710152337.45252.nickpiggin@yahoo.com.au>

On Mon, Oct 15, 2007 at 11:37:44PM +1000, Nick Piggin wrote:
> I hate to go completely offtopic here, but disks are so incredibly
> slow when compared to RAM that there is really nothing the kernel
> can do about this. Presumably the job will finish, given infinite
> time.

About 6 weeks ago, on a 2.6.23-rc kernel, I accidentally typed "make
-j", and left off the 4 before I hit the return key. About 2-3
minutes later, the box locked up pretty tight. I managed to switch to
a VT console before I lost total control of X (the switch itself took
many, many minutes), and after many more minutes I managed to get
logged in on the console, but I wasn't able to get a ps command to
complete so I could start killing processes. (I probably should have
just done a "killall make" right away, but hindsight is 20/20.) The
console was showing that the OOM killer was attempting to kill
processes, but apparently not fast enough to stem the tide of all of
the new processes getting generated by the make -j. (I'm guessing
that it was killing the gcc processes and not the make processes.)

> Would an oom-kill-someone-now sysrq be of help, I wonder?

I tried sysrq-f (oom_kill), but no dice. Given that the OOM killer
was active and apparently triggering on its own, this wasn't all that
surprising. The interesting thing is that I tried a sysrq-e (send
SIGTERM to all processes except init), waited 5 minutes or so, then
tried an alt-sysrq-i (send SIGKILL to all processes except init), and
the system was still thrashing itself to death, even after giving it
plenty of time to try to recover. I finally gave up and held down the
power button.

This was on a box with 4 gigs of memory (but only 3 gigs visible,
thanks to a cheap BIOS/chipset) and 4 gigs of swap (mainly intended
for suspend/resume).

I chalked it up to me being stupid (I should have noticed and
Ctrl-C'ed the make -j much more quickly, or if I were a sysadmin on a
time-sharing system with users I didn't trust, configured
RLIMIT_NPROC and/or per-user container resource limits) and the OOM
killer not being aggressive enough in such a situation. But having
better things to do, I didn't go whining on LKML about it, although I
have to say that the kernel behavior isn't exactly ideal.

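[Editorial sketch, not part of the original message.] The RLIMIT_NPROC
idea mentioned above can be illustrated roughly as follows. This is a
minimal sketch under stated assumptions: the helper name
run_with_nproc_cap and the choice of cap are hypothetical, and the
only real interfaces used are Python's standard resource and
subprocess modules. RLIMIT_NPROC counts every process and thread owned
by the real user ID, not just the descendants of one command, and it
is not enforced for sufficiently privileged processes (such as root),
so it bounds a fork storm rather than memory use as such.

```python
import resource
import subprocess


def run_with_nproc_cap(cmd, max_procs):
    """Run cmd with the RLIMIT_NPROC soft limit lowered to at most max_procs.

    Hypothetical helper for illustration only. The limit is set in the
    child between fork and exec, so the calling process keeps its own limit.
    """

    def lower_limit():
        soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
        new_soft = max_procs
        # Only ever lower the soft limit; never raise an existing one.
        if soft != resource.RLIM_INFINITY:
            new_soft = min(new_soft, soft)
        # The hard limit is left untouched, so no privilege is required.
        resource.setrlimit(resource.RLIMIT_NPROC, (new_soft, hard))

    # preexec_fn is not safe in multi-threaded callers; acceptable for a sketch.
    return subprocess.run(
        cmd, preexec_fn=lower_limit, capture_output=True, text=True
    )
```

With a cap in place, something like run_with_nproc_cap(["make", "-j"],
512) would start seeing fork() fail with EAGAIN once the user's
process count reached the limit, instead of spawning compilers without
bound. Whether make then fails cleanly is a separate question.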
One of these days when I have time, I'll try investigating it with a
few memlocked processes running at real-time priorities and Systemtap
and figure out what the heck was going on....

I suppose I should just configure suspending to a file instead of a
swap partition, but I've just historically trusted suspend/resume to
a swap partition much more than to a file. Or maybe I should hack in
a sysctl to prevent any swapping even though the swap partition is
configured (so only suspend/resume will use it).

						- Ted