Date: Mon, 11 May 2015 19:10:21 -0400
From: "Theodore Ts'o"
To: Sage Weil
Cc: Trond Myklebust, Dave Chinner, Zach Brown, Alexander Viro,
    Linux FS-devel Mailing List, Linux Kernel Mailing List,
    Linux API Mailing List
Subject: Re: [PATCH RFC] vfs: add a O_NOMTIME flag
Message-ID: <20150511231021.GC14088@thunk.org>

On Mon, May 11, 2015 at 09:24:09AM -0700, Sage Weil wrote:
> > Let me re-ask the question that I asked last week (and was apparently
> > ignored).  Why not trying to use the lazytime feature instead of
> > pointing a head straight at the application's --- and system
> > administrators' --- heads?
>
> Sorry Ted, I thought I responded already.
>
> The goal is to avoid inode writeout entirely when we can, and
> as I understand it lazytime will still force writeout before the inode
> is dropped from the cache.
> In systems like Ceph in particular, the
> IOs can be spread across lots of files, so simply deferring writeout
> doesn't always help.

Sure, but it would reduce the writeout by orders of magnitude.  I can
understand if you want to reduce it further, but it might be good
enough for your purposes.

I considered doing the equivalent of O_NOMTIME for our purposes at
$WORK, and our use case is actually not that different from Ceph's
(i.e., using a local disk file system to support a cluster file
system), and lazytime was (a) something I figured was something I
could upstream in good conscience, and (b) was more than good enough
for us.

Cheers,

- Ted

P.S.  I do agree that if we do need this upstream, requiring a mount
option to enable the feature is probably a good compromise.