Date: Tue, 28 Jun 2011 15:12:33 -0700
From: Andrew Morton
To: Andrea Righi
Cc: Minchan Kim, Peter Zijlstra, Johannes Weiner, KAMEZAWA Hiroyuki, Andrea Arcangeli, Hugh Dickins, Jerry James, Marcus Sorensen, Matt Heaton, KOSAKI Motohiro, Rik van Riel, Theodore Tso, Shaohua Li, Pádraig Brady, linux-mm, LKML
Subject: Re: [PATCH v4 0/2] fadvise: move active pages to inactive list with POSIX_FADV_DONTNEED
Message-Id: <20110628151233.f0a279be.akpm@linux-foundation.org>
In-Reply-To: <1309181361-14633-1-git-send-email-andrea@betterlinux.com>
References: <1309181361-14633-1-git-send-email-andrea@betterlinux.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 27 Jun 2011 15:29:19 +0200 Andrea Righi wrote:

> There were some reported problems in the past about trashing page cache when a
> backup software (i.e., rsync) touches a huge amount of pages (see for example
> [1]).
>
> This problem has been almost fixed by the Minchan Kim's patch [2] and a proper
> use of fadvise() in the backup software. For example this patch set [3] has
> been proposed for inclusion in rsync.
>
> However, there can be still other similar trashing problems: when the backup
> software reads all the source files, some of them may be part of the actual
> working set of the system.
> When a POSIX_FADV_DONTNEED is performed _all_ pages
> are evicted from pagecache, both the working set and the use-once pages touched
> only by the backup software.
>
> With the following solution when POSIX_FADV_DONTNEED is called for an active
> page instead of removing it from the page cache it is added to the tail of the
> inactive list. Otherwise, if it's already in the inactive list the page is
> removed from the page cache. Pages mapped by other processes or unevictable
> pages are not touched at all.
>
> In this way if the backup was the only user of a page, that page will be
> immediately removed from the page cache by calling POSIX_FADV_DONTNEED. If the
> page was also touched by other processes it'll be moved to the inactive list,
> having another chance of being re-added to the working set, or simply reclaimed
> when memory is needed.

So if an application touches a page twice and then runs
POSIX_FADV_DONTNEED, that page will now not be freed.

That's a big behaviour change.  For many existing users
POSIX_FADV_DONTNEED simply doesn't work any more!

I'd have thought that adding a new POSIX_FADV_ANDREA would be safer
than this.

The various POSIX_FADV_foo's are so ill-defined that it was a mistake
to ever use them.  We should have done something overtly linux-specific
and given userspace more explicit and direct pagecache control.
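The behaviour change discussed above can be made concrete with a toy
model. The Python sketch below is not kernel code. The Page class and
the touch() and fadvise_dontneed_proposed() helpers are hypothetical
names, and the rule that a second access promotes a page to the active
list is a simplifying assumption about the use-once logic. It only
encodes the policy as the quoted cover letter states it.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Toy stand-in for one page-cache page (hypothetical, not a kernel struct)."""
    accesses: int = 0          # how many times the page has been touched
    active: bool = False       # True: on the active list, False: on the inactive list
    mapped: bool = False       # mapped by some other process
    unevictable: bool = False  # e.g. locked in memory
    cached: bool = True        # still present in the page cache


def touch(page: Page) -> None:
    """Record an access. Assumption: the second access activates the page."""
    page.accesses += 1
    if page.accesses >= 2:
        page.active = True


def fadvise_dontneed_proposed(page: Page) -> str:
    """Apply the policy described in the cover letter to a single page."""
    if page.mapped or page.unevictable:
        return "untouched"
    if page.active:
        # Active pages are moved to the inactive list instead of being dropped.
        page.active = False
        return "deactivated"
    # Pages already on the inactive list are removed from the page cache.
    page.cached = False
    return "evicted"


# A page read once (for example by the backup only) is dropped.
once = Page()
touch(once)
print(fadvise_dontneed_proposed(once), once.cached)    # evicted False

# A page touched twice survives the call, which is the case objected to above.
twice = Page()
touch(twice)
touch(twice)
print(fadvise_dontneed_proposed(twice), twice.cached)  # deactivated True
```

In this model a second call on the same page does evict it, because the
first call already left it on the inactive list.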