Message-ID: <54D3E44B.7060501@redhat.com>
Date: Thu, 05 Feb 2015 16:44:43 -0500
From: Rik van Riel
To: Mel Gorman, Andrew Morton
CC: linux-mm@kvack.org, Minchan Kim, Vlastimil Babka, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
References: <20150202165525.GM2395@suse.de> <20150202140506.392ff6920743f19ea44cff59@linux-foundation.org> <20150202221824.GN2395@suse.de>
In-Reply-To: <20150202221824.GN2395@suse.de>

On 02/02/2015 05:18 PM, Mel Gorman wrote:
> On Mon, Feb 02, 2015 at 02:05:06PM -0800, Andrew Morton wrote:
>> On Mon, 2 Feb 2015 16:55:25 +0000 Mel Gorman wrote:
>>
>>> glibc malloc changed behaviour in glibc 2.10 to have per-thread arenas
>>> instead of creating new arenas if the existing ones were contended.
>>> The decision appears to have been made so the allocator scales better but the
>>> downside is that madvise(MADV_DONTNEED) is now called for these per-thread
>>> arenas during free. This tears down pages that would have previously
>>> remained. There is nothing wrong with this decision from a functional point
>>> of view but any threaded application that frequently allocates/frees the
>>> same-sized region is going to incur the full teardown and refault costs.
>>
>> MADV_DONTNEED has been there for many years.  How could this problem
>> not have been noticed during glibc 2.10 development/testing?
>
> I do not know. I only spotted it due to switching distributions. Looping
> allocations and frees of the same sizes is considered inefficient and it
> might have been dismissed on those grounds. It's probably less noticeable
> when it only affects threaded applications.
>
>> Is there
>> some more recent kernel change which is triggering this?
>>
>
> Not that I'm aware of.
>
>>> This patch identifies when a thread is frequently calling MADV_DONTNEED
>>> on the same region of memory and starts ignoring the hint.
>>
>> That's pretty nasty-looking :(
>>
>
> Yep, it is but we're very limited in terms of what we can do within the
> kernel here.
>
>> And presumably there are all sorts of behaviours which will still
>> trigger the problem but which will avoid the start/end equality test in
>> ignore_madvise_hint()?
>>
>
> Yes. I would expect that a simple pattern of multiple allocs followed by
> multiple frees in a loop would also trigger it.
>
>> Really, this is a glibc problem and only a glibc problem.
>> MADV_DONTNEED is unavoidably expensive and glibc is calling
>> MADV_DONTNEED for a region which it *does* need.
>
> To be fair to glibc, it calls it on a region it *thinks* it doesn't need only
> to reuse it immediately afterwards because of how the benchmark is
> implemented.
>
>> Is there something
>> preventing this from being addressed within glibc?
>
> I doubt it other than I expect they'll punt it back and blame either the
> application for being stupid or the kernel for being slow.

This sounds like something that could benefit from Minchan's
MADV_FREE, instead of MADV_DONTNEED.

If non-page-aligned malloc/free does not depend on pages being
zeroed, I suspect an MADV_DONTNEED resulting from a malloc/free
loop also does not depend on it.
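
As a rough illustration of the zeroing behaviour mentioned above, here is a minimal sketch, not anything taken from glibc or from the patch. It assumes Linux and Python 3.8 or later, where the mmap module exposes madvise(); the function name dontneed_roundtrip and the 0xAA fill byte are made up for the example. It fills a private anonymous mapping, issues MADV_DONTNEED on it, and reads the first byte back. The read after the hint sees a freshly zero-filled page, which is the teardown and refault work discussed in this thread. MADV_FREE is not exercised here.

```python
import mmap

PAGE = mmap.PAGESIZE


def dontneed_roundtrip(npages=4):
    """Return (before, after): the first byte of a private anonymous
    mapping before and after madvise(MADV_DONTNEED) on the whole region.

    Hypothetical helper for illustration only; assumes Linux semantics.
    """
    length = npages * PAGE
    # MAP_PRIVATE is requested explicitly: Python's mmap defaults to
    # MAP_SHARED, and the discard-and-zero behaviour shown here applies
    # to private anonymous memory.
    region = mmap.mmap(
        -1,
        length,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    try:
        # Touch every page so each one is actually backed by memory.
        region[:] = b"\xaa" * length
        before = region[0]

        # Tell the kernel the contents are no longer needed. The pages
        # are torn down immediately.
        region.madvise(mmap.MADV_DONTNEED)

        # This access refaults the first page, which comes back zero-filled.
        after = region[0]
        return before, after
    finally:
        region.close()


if __name__ == "__main__":
    before, after = dontneed_roundtrip()
    print(f"before: {before:#04x}  after: {after:#04x}")
```

An allocator that reuses such a region right after releasing it pays for that refault on every iteration, even though, as noted above, a malloc/free loop does not rely on the pages coming back zeroed.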