Date: Wed, 25 May 2011 18:06:59 +0200
From: Andrea Arcangeli
To: Johannes Hirte
Cc: Ulrich Keller, linux-kernel@vger.kernel.org, Thomas Sattler
Subject: Re: iotop: khugepaged at 99.99% (2.6.38.3)
Message-ID: <20110525160659.GE19505@random.random>
References: <4DAF6C0B.3070009@gmx.de> <20110512140352.GG11579@random.random> <201105232005.56840.johannes.hirte@fem.tu-ilmenau.de>
In-Reply-To: <201105232005.56840.johannes.hirte@fem.tu-ilmenau.de>

On Mon, May 23, 2011 at 08:05:55PM +0200, Johannes Hirte wrote:
> Is there any progress on this? I've observed this behavior different times too,
> with kernel 2.6.39-rc7. After a while working some processes (kmail,
> akregator, konqueror) got stuck in D state together with the khugepaged task.
> I could kill the hanging process (kill -n 9) but the khugepaged task stayed in
> D state.
> The system is a Pentium M (Banias) with 1.3GHz and 1.5G RAM. Attached is the
> output from multiple SYSRQ+T, content from /proc/zoneinfo and the config.

Great progress, thanks to your info.

    nr_isolated_file 4294967295 0xffffffff

I haven't checked all the data you provided yet, but the critical bit is
that we're off by one in nr_isolated_file, and that's the bug. It looks
unrelated to anon memory and THP. I'm still reviewing, but I recommend
everyone check this nr_isolated_file code, hoping it's not random memory
corruption.

Why it has only happened on 32bit so far is unclear though; you're 32bit too.
I doubt it's an overflow. The value is unsigned too, so it has to be an
underflow, and in theory that should have happened on 64bit too. You and
Ulrich both have PREEMPT=y, so it may be a preempt bug. What about you,
Thomas, do you have PREEMPT=y too?