Date: Wed, 27 Apr 2011 10:14:53 +1000
From: Dave Chinner
To: Andrea Righi
Cc: Andrew Morton, Al Viro, Arnd Bergmann, linux-fsdevel@vger.kernel.org,
    linux-api@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC] [PATCH] drop_pagecache syscall
Message-ID: <20110427001453.GD12436@dastard>
In-Reply-To: <1303853727-21444-1-git-send-email-andrea@betterlinux.com>

On Tue, Apr 26, 2011 at 11:35:27PM +0200, Andrea Righi wrote:
> Introduce sys_drop_pagecache() system call to drop the page cache pages of
> a single filesystem.
>
> This new system call takes a file descriptor as argument and drops only
> the page cache pages of the file system it references.
>
> At the moment it is possible to drop page cache pages via
> /proc/sys/vm/drop_pagecache or via posix_fadvise(POSIX_FADV_DONTNEED).
>
> The first method drops the whole page cache while the second can be used
> to drop page cache pages of a single file descriptor. But there's not a
> simple way to drop all the pages of a filesystem (we could scan all the
> file descriptors and use posix_fadvise(), but this solution doesn't scale
> very well in some cases).

Why not just add a new posix_fadvise() command?
e.g. POSIX_FADV_DONTNEED_FS. Simpler than adding a new syscall...

> This functionality can be used by all the applications that want to have a
> better control over the page cache management (for example to immediately drop
> pages that for sure will not be reused in the near future, without calling
> posix_fadvise() for all the files they've touched), or to provide a more fine
> grained debugging feature usable by the filesystem benchmarks.
>
> The system call does not require root privileges and it can be called by any
> unprivileged application. For example, we can write a userspace tool to run
> something like this:
>
> $ drop-pagecache /path/file_or_dir

That's a potential DOS vector, I think. Drop the pagecache in a hard
loop on the root fs of a busy server and watch it crawl...

> +/*
> + * Drop page cache of a single superblock
> + */
> +SYSCALL_DEFINE1(drop_pagecache, int, fd)
> +{
> +	struct file *file;
> +	struct super_block *sb;
> +	int fput_needed;
> +
> +	file = fget_light(fd, &fput_needed);
> +	if (!file)
> +		return -EBADF;
> +	sb = file->f_dentry->d_sb;
> +
> +	down_read(&sb->s_umount);
> +	drop_pagecache_sb(sb, NULL);
> +	up_read(&sb->s_umount);
> +
> +	fput_light(file, fput_needed);
> +	return 0;

You're holding an open reference to a file/dir on the fs so it can't
be unmounted from under you. Hence I don't think you need the
s_umount locking.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
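
For reference, the existing per-file mechanism mentioned in the quoted
description, posix_fadvise(POSIX_FADV_DONTNEED), can be sketched from
userspace roughly as below. This is only an illustration and not part of the
patch: drop_file_cache() is a hypothetical helper name, and since
POSIX_FADV_DONTNEED_FS is just a suggestion in this thread, the sketch uses
only the per-file advice that exists today.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* expose posix_fadvise() in <fcntl.h> */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Ask the kernel to drop the cached pages of a single file.
 * Returns 0 on success or a positive errno value on failure.
 * (Hypothetical helper, for illustration only.)
 */
int drop_file_cache(const char *path)
{
	int fd = open(path, O_RDONLY);
	int err;

	if (fd < 0)
		return errno;

	/* offset 0 with len 0 covers the file from start to end */
	err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

	close(fd);
	return err;	/* posix_fadvise() returns the error number directly */
}
```

A tool would call this once per file, e.g. drop_file_cache("/path/file"),
which is the per-descriptor scan the patch description says does not scale to
a whole filesystem. Note that the advice is a hint and dirty pages are not
discarded, so a writer would normally fsync() the file first.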