Date: Tue, 14 Feb 2012 15:22:20 -0800
From: Andrew Morton
To: Andrea Righi
Cc: Minchan Kim, Peter Zijlstra, Johannes Weiner, KAMEZAWA Hiroyuki,
    KOSAKI Motohiro, Rik van Riel, Hugh Dickins, Alexander Viro,
    Shaohua Li, Pádraig Brady, John Stultz, Jerry James, Julius Plenz,
    linux-mm, linux-fsdevel@vger.kernel.org, LKML
Subject: Re: [RFC] [PATCH v5 0/3] fadvise: support POSIX_FADV_NOREUSE
Message-Id: <20120214152220.4f621975.akpm@linux-foundation.org>
In-Reply-To: <20120214225922.GA12394@thinkpad>
References: <1329006098-5454-1-git-send-email-andrea@betterlinux.com>
    <20120214133337.9de7835b.akpm@linux-foundation.org>
    <20120214225922.GA12394@thinkpad>

On Tue, 14 Feb 2012 23:59:22 +0100
Andrea Righi wrote:

> On Tue, Feb 14, 2012 at 01:33:37PM -0800, Andrew Morton wrote:
> > On Sun, 12 Feb 2012 01:21:35 +0100
> > Andrea Righi wrote:
> >
> > > The new proposal is to implement POSIX_FADV_NOREUSE as a way to perform a real
> > > drop-behind policy where applications can mark certain intervals of a file as
> > > FADV_NOREUSE before accessing the data.
> >
> > I think you and John need to talk to each other, please.  The amount of
> > duplication here is extraordinary.
>
> Yes, definitely. I'm currently reviewing and testing the John's patch
> set.
> I was even considering to apply my patch set on top of the John's
> patch, or at least propose my tree-based approach to manage the list of
> the POSIX_FADV_VOLATILE ranges.

Cool.

> > Both patchsets add fields to the address_space (and hence inode), which
> > is significant - we should convince ourselves that we're getting really
> > good returns from a feature which does this.
> >
> > Regarding the use of fadvise(): I suppose it's a reasonable thing to do
> > in the long term - if the feature works well, popular data streaming
> > applications will eventually switch over.  But I do think we should
> > explore interfaces which don't require modification of userspace source
> > code.  Because there will always be unconverted applications, and the
> > feature becomes available immediately.
> >
> > One such interface would be to toss the offending application into a
> > container which has a modified drop-behind policy.  And here we need to
> > drag out the crystal ball: what *is* the best way of tuning application
> > pagecache behaviour?  Will we gravitate towards containerization, or
> > will we gravitate towards finer-tuned fadvise/sync_page_range/etc
> > behaviour?  Thus far it has been the latter, and I don't think that has
> > been a great success.
> >
> > Finally, are the problems which prompted these patchsets already
> > solved?  What happens if you take the offending streaming application
> > and toss it into a 16MB memcg?  That *should* avoid perturbing other
> > things running on that machine.
>
> Moving the streaming application into a 16MB memcg can be dangerous in
> some cases... the application might start to do "bad" things, like
> swapping (if the memcg can swap) or just fail due to OOMs.

Well OK, maybe there are problems with the current implementation.  But
are they unfixable problems?  Is the right approach to give up on ever
making containers useful for this application and to instead go off and
implement a new and separate feature?

> > And yes, a container-based approach is pretty crude, and one can
> > envision applications which only want modified reclaim policy for one
> > particular file.  But I suspect an application-wide reclaim policy
> > solves 90% of the problems.
>
> I really like the container-based approach. But for this we need a
> better file cache control in the memory cgroup; now we have the
> accounting of file pages, but there's no way to limit them.

Again, if/when memcg becomes sufficiently useful for this application
we're left maintaining the obsolete POSIX_FADV_NOREUSE for ever.