From: Linus Torvalds
Date: Wed, 27 Apr 2011 16:28:09 -0700
Subject: Re: 2.6.39-rc4+: Kernel leaking memory during FS scanning, regression?
To: Thomas Gleixner
Cc: "Paul E. McKenney", Bruno Prémont, Ingo Molnar, Peter Zijlstra, Mike Frysinger, KOSAKI Motohiro, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, "Paul E. McKenney", Pekka Enberg

On Wed, Apr 27, 2011 at 3:32 PM, Thomas Gleixner wrote:
>
> Well that's going to paper over the problem at hand possibly. I really
> don't see why that thing would run for more than 950ms in a row even
> if there is a large number of callbacks pending.

Stop with this bogosity already, guys. We _know_ it didn't run
continuously for 950ms. That number is totally made up. There's not
enough work for it to run that long, but more importantly, the thread
has zero CPU time. There is _zero_ reason to believe that it runs for
long periods.

There is some scheduler bug, probably the rt_time hasn't been
initialized at all, or runtime we compare against is zero, or the
calculations are just wrong.

The 950ms didn't happen. Stop harping on it. It almost certainly
simply doesn't exist.

Since that

        if (rt_rq->rt_time > runtime) {
                rt_rq->rt_throttled = 1;
+               printk_once(KERN_WARNING "sched: RT throttling activated\n");

test triggers, we know that either 'runtime' or 'rt_time' is just
bogus. Make the printk print out the values, and maybe that gives some
hints. But in the meantime, I'd suggest looking for the places that
initialize or calculate those values, and just assume that some of
them are buggy.

> And then I don't have an explanation for the hosed CPU accounting and
> why that thing does not get another 950ms RT time when the 50ms
> throttling break is over.

Again, don't even bother talking about "another 950ms". It didn't
happen in the first place, there's no "another" there either.

Linus
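The suggestion in the mail to make the printk print out the values can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: `struct rt_rq_model`, `rt_runtime_exceeded_model()` and `warned_once` are hypothetical names invented for illustration, `fprintf` stands in for `printk_once`, and only the fields `rt_time` and `rt_throttled` and the comparison against `runtime` come from the snippet quoted in the mail.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's per-runqueue RT state.
 * Only the two fields named in the mail are modelled. */
struct rt_rq_model {
	uint64_t rt_time;      /* RT time accounted so far, in ns (assumed unit) */
	int      rt_throttled; /* set once the check below fires */
};

/* Mimics the "once" behaviour of printk_once(): report only the first hit. */
static int warned_once;

/* Returns 1 if the queue gets throttled, 0 otherwise. */
static int rt_runtime_exceeded_model(struct rt_rq_model *rt_rq, uint64_t runtime)
{
	if (rt_rq->rt_time > runtime) {
		rt_rq->rt_throttled = 1;
		if (!warned_once) {
			warned_once = 1;
			/* Print both operands so a bogus value is visible. */
			fprintf(stderr,
				"sched: RT throttling activated (rt_time=%" PRIu64
				" runtime=%" PRIu64 ")\n",
				rt_rq->rt_time, runtime);
		}
		return 1;
	}
	return 0;
}
```

In this model, calling `rt_runtime_exceeded_model()` with `runtime` equal to 0 throttles on any nonzero `rt_time`, which corresponds to one of the hypotheses in the mail (the runtime being compared against is zero). Printing both operands would make that case distinguishable from an uninitialized or miscalculated `rt_time`.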