Date: Fri, 29 Apr 2011 00:02:00 +0200 (CEST)
From: Thomas Gleixner
To: john stultz
cc: Bruno Prémont, sedat.dilek@gmail.com, Mike Galbraith,
    "Paul E. McKenney", Linus Torvalds, Ingo Molnar, Peter Zijlstra,
    Mike Frysinger, KOSAKI Motohiro, LKML, linux-mm@kvack.org,
    linux-fsdevel@vger.kernel.org, Pekka Enberg
Subject: Re: 2.6.39-rc4+: Kernel leaking memory during FS scanning, regression?
In-Reply-To: <1304027480.2971.121.camel@work-vm>

On Thu, 28 Apr 2011, john stultz wrote:

> On Thu, 2011-04-28 at 23:04 +0200, Thomas Gleixner wrote:
> > /me suspects hrtimer changes to be the real culprit.
>
> I'm not seeing anything on right off, but it does smell like
> e06383db9ec591696a06654257474b85bac1f8cb would be where such an issue
> would crop up.
>
> Bruno, could you try checking out e06383db9ec, confirming it still
> occurs (and then maybe seeing if it goes away at e06383db9ec^1)?
>
> I'll keep digging in the meantime.

I found the bug already. The problem is that sched_init() calls
init_rt_bandwidth() which calls hrtimer_init() _BEFORE_
hrtimers_init() is called. That went unnoticed so far because the
CLOCK id to hrtimer base conversion was hardcoded. Now we use a table
which is set up at hrtimers_init(), so the bandwidth hrtimer ends up on
CLOCK_REALTIME because the table is in the bss.

The patch below fixes this by initializing the table statically
rather than at runtime. Though that whole ordering wants to be
revisited.

Thanks,

	tglx

--- linux-2.6.orig/kernel/hrtimer.c
+++ linux-2.6/kernel/hrtimer.c
@@ -81,7 +81,11 @@ DEFINE_PER_CPU(struct hrtimer_cpu_base,
 	}
 };
 
-static int hrtimer_clock_to_base_table[MAX_CLOCKS];
+static int hrtimer_clock_to_base_table[MAX_CLOCKS] = {
+	[CLOCK_REALTIME] = HRTIMER_BASE_REALTIME,
+	[CLOCK_MONOTONIC] = HRTIMER_BASE_MONOTONIC,
+	[CLOCK_BOOTTIME] = HRTIMER_BASE_BOOTTIME,
+};
 
 static inline int hrtimer_clockid_to_base(clockid_t clock_id)
 {
@@ -1722,10 +1726,6 @@ static struct notifier_block __cpuinitda
 
 void __init hrtimers_init(void)
 {
-	hrtimer_clock_to_base_table[CLOCK_REALTIME] = HRTIMER_BASE_REALTIME;
-	hrtimer_clock_to_base_table[CLOCK_MONOTONIC] = HRTIMER_BASE_MONOTONIC;
-	hrtimer_clock_to_base_table[CLOCK_BOOTTIME] = HRTIMER_BASE_BOOTTIME;
-
 	hrtimer_cpu_notify(&hrtimers_nb, (unsigned long)CPU_UP_PREPARE,
 			  (void *)(long)smp_processor_id());
 	register_cpu_notifier(&hrtimers_nb);
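
For readers who want to see the failure mode outside the kernel, here is
a minimal userspace sketch of the ordering problem described in this
mail. It is not kernel code: every demo_/DEMO_ name is hypothetical, and
the numeric values are assumptions picked only so that the zero-filled
slot corresponds to the realtime base, as the mail describes.

```c
/*
 * Userspace sketch (not kernel code). All demo_/DEMO_ names and the
 * numeric values are illustrative assumptions.
 */
enum demo_clock {
	DEMO_CLOCK_REALTIME  = 0,
	DEMO_CLOCK_MONOTONIC = 1,
	DEMO_CLOCK_BOOTTIME  = 7,
	DEMO_MAX_CLOCKS      = 16
};

enum demo_base {
	DEMO_BASE_REALTIME  = 0,	/* zero, so an unset slot reads as "realtime" */
	DEMO_BASE_MONOTONIC = 1,
	DEMO_BASE_BOOTTIME  = 2
};

/* Old scheme: lives in .bss, so every slot is 0 until demo_late_init() runs. */
static int demo_runtime_table[DEMO_MAX_CLOCKS];

/* Patched scheme: designated initializers, correct before any code runs. */
static const int demo_static_table[DEMO_MAX_CLOCKS] = {
	[DEMO_CLOCK_REALTIME]  = DEMO_BASE_REALTIME,
	[DEMO_CLOCK_MONOTONIC] = DEMO_BASE_MONOTONIC,
	[DEMO_CLOCK_BOOTTIME]  = DEMO_BASE_BOOTTIME,
};

/* Stands in for the runtime setup that the patch removes. */
void demo_late_init(void)
{
	demo_runtime_table[DEMO_CLOCK_REALTIME]  = DEMO_BASE_REALTIME;
	demo_runtime_table[DEMO_CLOCK_MONOTONIC] = DEMO_BASE_MONOTONIC;
	demo_runtime_table[DEMO_CLOCK_BOOTTIME]  = DEMO_BASE_BOOTTIME;
}

/* Lookup against the runtime-filled table; wrong if called too early. */
int demo_lookup_runtime(int clock_id)
{
	return demo_runtime_table[clock_id];
}

/* Lookup against the statically initialized table; safe at any time. */
int demo_lookup_static(int clock_id)
{
	return demo_static_table[clock_id];
}
```

Calling demo_lookup_runtime(DEMO_CLOCK_MONOTONIC) before demo_late_init()
yields DEMO_BASE_REALTIME, which mirrors the early hrtimer_init() call
landing on the wrong base. demo_lookup_static() gives the right answer
regardless of call order, which is the property the patch relies on.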